20,959 research outputs found

Deep Learning for Environmentally Robust Speech Recognition: An Overview of Recent Developments

Author: Geiger Jürgen
Jin Wenyu
Mousa Amr El-Desoky
Pohjalainen Jouni
Schuller Björn
Zhang Zixing
Publication date: 01/01/2018

Eliminating the negative effect of non-stationary environmental noise is a long-standing research topic for automatic speech recognition that remains an important challenge. Data-driven supervised approaches, including ones based on deep neural networks, have recently emerged as potential alternatives to traditional unsupervised approaches and, with sufficient training, can alleviate the shortcomings of the unsupervised methods in various real-life acoustic environments. In this light, we review recently developed, representative deep learning approaches for tackling non-stationary additive and convolutional degradation of speech, with the aim of providing guidelines for those involved in the development of environmentally robust speech recognition systems. We separately discuss single- and multi-channel techniques developed for the front-end and back-end of speech recognition systems, as well as joint front-end and back-end training frameworks.

arXiv.org e-Print Archive

OPUS Augsburg

Spectro-temporal post-enhancement using MMSE estimation in NMF based single-channel source separation

Author: Erdoğan Hakan
Grais Emad Mounir
Publication venue: ISCA (International Speech Communication Association)
Publication date: 01/01/2013

We propose to use minimum mean squared error (MMSE) estimates to enhance the signals that are separated by nonnegative matrix factorization (NMF). In single-channel source separation (SCSS), NMF is used to train a set of basis vectors for each source from their training spectrograms. NMF is then used to decompose the mixed signal spectrogram as a weighted linear combination of the trained basis vectors, from which estimates of each corresponding source can be obtained. In this work, we treat the spectrogram of each separated signal as a 2D distorted signal that needs to be restored. A multiplicative distortion model is assumed, where the logarithm of the true signal distribution is modeled with a Gaussian mixture model (GMM) and the distortion is modeled as having a log-normal distribution. The parameters of the GMM are learned from training data, whereas the distortion parameters are learned online from each separated signal. The initial source estimates are improved and replaced with their MMSE estimates under this new probabilistic framework. The experimental results show that using the proposed MMSE estimation technique as a post-enhancement step after NMF improves the quality of the separated signal.

CiteSeerX

Sabanci University Research Database
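
The NMF separation step described in the abstract above (train basis vectors per source, then decompose the mixture over the stacked bases) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the function names `nmf` and `separate`, the Euclidean-cost multiplicative updates, the ranks, and the iteration count are all assumptions, and the MMSE post-enhancement stage is not shown.

```python
import numpy as np

def nmf(V, rank, n_iter=200, W=None, seed=0, eps=1e-9):
    """Approximate V by W @ H using multiplicative updates (Euclidean cost).

    If W is supplied it is held fixed and only the weights H are updated;
    this is the assumed way of reusing trained basis vectors on a mixture.
    """
    rng = np.random.default_rng(seed)
    fixed_W = W is not None
    if not fixed_W:
        W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        # Update the weights; eps guards against division by zero.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        if not fixed_W:
            # Update the basis vectors only during training.
            W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def separate(V_mix, W1, W2, n_iter=200):
    """Decompose a mixture magnitude spectrogram over two trained bases."""
    W = np.hstack([W1, W2])                 # stack both sources' bases
    _, H = nmf(V_mix, W.shape[1], n_iter=n_iter, W=W)
    k1 = W1.shape[1]
    S1 = W1 @ H[:k1]                        # initial estimate of source 1
    S2 = W2 @ H[k1:]                        # initial estimate of source 2
    return S1, S2
```

Here `V_mix`, `W1`, and `W2` stand for nonnegative magnitude spectrograms and bases (frequency bins along rows). In the workflow the abstract describes, `S1` and `S2` would be the initial estimates that the MMSE stage then refines.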

Sparse Signal Processing Concepts for Efficient 5G System Design

Author: Boche Holger
Jung Peter
Strohmer Thomas
Wunder Gerhard
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 26/01/2015

As it becomes increasingly apparent that 4G will not be able to meet the emerging demands of future mobile communication systems, the questions of what could make up a 5G system, what the crucial challenges are, and what the key drivers are have become part of intensive, ongoing discussions. Partly due to the advent of compressive sensing, methods that can optimally exploit sparsity in signals have received tremendous attention in recent years. In this paper we will describe a variety of scenarios in which signal sparsity arises naturally in 5G wireless systems. Signal sparsity and the associated rich collection of tools and algorithms will thus be a viable source for innovation in 5G wireless system design. We will describe applications of this sparse signal processing paradigm in MIMO random access, cloud radio access networks, compressive channel-source network coding, and embedded security. We will also emphasize important open problems that may arise in 5G system design, in whose solution sparsity will potentially play a key role.

Comment: 18 pages, 5 figures, accepted for publication in IEEE Access

arXiv.org e-Print Archive

Fraunhofer-ePrints

Blind Curvelet based Denoising of Seismic Surveys in Coherent and Incoherent Noise Environments

Author: AlRegib Ghassan
Deriche Mohamed
Iqbal Naveed
Publication date: 28/10/2018

The localized nature of curvelet functions, together with their frequency and dip characteristics, makes the curvelet transform an excellent choice for processing seismic data. In this work, a denoising method is proposed based on a combination of the curvelet transform and a whitening filter, along with a procedure for noise variance estimation. The whitening filter is added to get the best performance from the curvelet transform under coherent and incoherent correlated noise; furthermore, it simplifies the noise estimation method and makes it easy to use the standard threshold methodology without digging into the curvelet domain. The proposed method is tested on pseudo-synthetic data, created by adding noise to a real noiseless data set from the Netherlands offshore F3 block, and on a field data set from east Texas, USA, containing ground roll noise. Our experimental results show that the proposed algorithm can achieve the best results under all types of noise (incoherent, uncorrelated, or random noise, and coherent noise).

arXiv.org e-Print Archive