Performance evaluation of channel estimation techniques for a proposed '4G' MC-CDMA based system in a time varying channel

Authors: Armour SMD; Cooper MA; Dowler ASH; McGeehan JP
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/09/2004

Block-Online Multi-Channel Speech Enhancement Using DNN-Supported Relative Transfer Function Estimates

Authors: Bohac Marek; Koldovsky Zbynek; Malek Jiri
Publication venue: 'Institution of Engineering and Technology (IET)'
Publication date: 11/12/2019

This work addresses the problem of block-online processing for multi-channel speech enhancement. Such processing is vital in scenarios with moving speakers and/or when very short utterances are processed, e.g., in voice assistant scenarios. We consider several variants of a system that performs beamforming supported by DNN-based voice activity detection (VAD) followed by post-filtering. The speaker is targeted by estimating relative transfer functions between microphones. Each block of the input signals is processed independently in order to make the method applicable in highly dynamic environments. Owing to the short length of the processed block, the statistics required by the beamformer are estimated less precisely. The influence of this inaccuracy is studied and compared to the processing regime in which recordings are treated as one block (batch processing). The experimental evaluation of the proposed method is performed on the large CHiME-4 datasets and on another dataset featuring a moving target speaker. The experiments are evaluated in terms of objective and perceptual criteria (such as signal-to-interference ratio (SIR) and perceptual evaluation of speech quality (PESQ), respectively). Moreover, the word error rate (WER) achieved by a baseline automatic speech recognition system, for which the enhancement method serves as a front-end solution, is evaluated. The results indicate that the proposed method is robust with respect to the short length of the processed block. Significant improvements in terms of the criteria and WER are observed even for a block length of 250 ms.

Comment: 10 pages, 8 figures, 4 tables. Modified version of the article accepted for publication in IET Signal Processing journal. Original results unchanged, additional experiments presented, refined discussion and conclusion

arXiv.org e-Print Archive

DSpace@TUL

Rank-1 Constrained Multichannel Wiener Filter for Speech Recognition in Noisy Environments

Authors: Serizel Romain; Vincent Emmanuel; Wang Ziteng; Yan Yonghong
Publication date: 14/11/2017

Multichannel linear filters, such as the Multichannel Wiener Filter (MWF) and the Generalized Eigenvalue (GEV) beamformer, are popular signal processing techniques which can improve speech recognition performance. In this paper, we present an experimental study on these linear filters in a specific speech recognition task, namely the CHiME-4 challenge, which features real recordings in multiple noisy environments. Specifically, the rank-1 MWF is employed for noise reduction and a new constant residual noise power constraint is derived which enhances the recognition performance. To fulfill the underlying rank-1 assumption, the speech covariance matrix is reconstructed based on eigenvectors or generalized eigenvectors. Then the rank-1 constrained MWF is evaluated against alternative multichannel linear filters under the same framework, which involves a Bidirectional Long Short-Term Memory (BLSTM) network for mask estimation. The proposed filter outperforms the alternatives, leading to a 40% relative Word Error Rate (WER) reduction compared with the baseline Weighted Delay and Sum (WDAS) beamformer on the real test set, and a 15% relative WER reduction compared with the GEV-BAN method. The results also suggest that the speech recognition accuracy correlates more with the Mel-frequency cepstral coefficients (MFCC) feature variance than with the noise reduction or the speech distortion level.

Comment: for Computer Speech and Language

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

Dynamic Models and Nonlinear Filtering of Wave Propagation in Random Fields

Author: Wei Haiqing
Publication venue: 'SPIE-Intl Soc Optical Eng'
Publication date: 01/01/2004

In this paper, a general model of wireless channels is established based on the physics of wave propagation. Then the problems of inverse scattering and channel prediction are formulated as nonlinear filtering problems. The solutions to the nonlinear filtering problems are given in the form of dynamic evolution equations of the estimated quantities. Finally, examples are provided to illustrate the practical applications of the proposed theory.

Comment: 12 pages, 1 figure

arXiv.org e-Print Archive

CiteSeerX

Crossref

CERN Document Server

Visualization on colour based flow vector of thermal image for movement detection during interactive session

Authors: Harun Mat; Lina Farhana Mahadi; Mohamad Nurazmi Raman; Nabilah Ibrahim; Sari Suhaila; Wan Nurshazwani Wan Zakaria
Publication venue: 'IOP Publishing'
Publication date: 01/01/2018

Thermal imaging has recently been exploited in applications such as motion and face detection, and it has drawn the attention of many researchers seeking to build such technology to improve lifestyle. This work proposes a technique to detect and identify motion in image sequences for application in security monitoring systems or outdoor surveillance. Conventional systems might produce false information in the presence of shadow. Thus, the methods employed in this work are the Canny edge detector and the Lucas-Kanade and Horn-Schunck algorithms, which overcome the major problem of the thresholding method, namely that only intensity or pixel magnitude is considered instead of relationships between the pixels. The results can be observed in the flow vector parameter and the colour-based segmentation image for the time frame from 1 to 10 seconds. The visualization of both parameters clarifies the movement and the changes of pixel intensity between two frames, supported by the colour segmentation, in either smooth or rough motion. Thus, this technique may contribute to other applications such as biometrics, military systems, and surveillance machines.
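
For readers unfamiliar with the Lucas-Kanade algorithm named in this abstract, the following is a minimal sketch of its textbook least-squares step for a single image window. It is not the authors' pipeline: the function name `lucas_kanade_window`, the plain nested-list image format, and the central-difference gradients are assumptions made for illustration only.

```python
def lucas_kanade_window(frame1, frame2):
    """Estimate one (u, v) flow vector for a small window, in pixels per frame.

    frame1, frame2: equally sized 2D lists of floats (rows of pixel intensities).
    Returns None when the window has too little texture to solve for the flow.
    Illustrative helper (hypothetical name), not taken from the paper.
    """
    rows, cols = len(frame1), len(frame1[0])
    sxx = sxy = syy = sxt = syt = 0.0
    # Skip the border so the central differences stay inside the window.
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            ix = (frame1[y][x + 1] - frame1[y][x - 1]) / 2.0  # spatial gradient in x
            iy = (frame1[y + 1][x] - frame1[y - 1][x]) / 2.0  # spatial gradient in y
            it = frame2[y][x] - frame1[y][x]                  # temporal difference
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
            sxt += ix * it
            syt += iy * it
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-9:
        return None  # aperture problem: gradients do not span both directions
    # Solve [[sxx, sxy], [sxy, syy]] [u, v]^T = -[sxt, syt]^T (brightness constancy).
    u = (-sxt * syy + syt * sxy) / det
    v = (-syt * sxx + sxt * sxy) / det
    return u, v
```

Because the flow is solved from gradients pooled over neighbouring pixels, the estimate depends on relationships between pixels rather than on a per-pixel intensity threshold, which is the contrast the abstract draws with thresholding.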

UTHM Institutional Repository

Crossref