Search CORE

3,914 research outputs found

Structured Sparsity Models for Multiparty Speech Recovery from Reverberant Recordings

Author: Asaei Afsaneh
Bourlard Hervé
Cevher Volkan
Golbabaee Mohammad
Publication venue
Publication date: 01/01/2012
Field of study

We tackle the multi-party speech recovery problem through modeling the acoustic of the reverberant chambers. Our approach exploits structured sparsity models to perform room modeling and speech recovery. We propose a scheme for characterizing the room acoustic from the unknown competing speech sources relying on localization of the early images of the speakers by sparse approximation of the spatial spectra of the virtual sources in a free-space model. The images are then clustered exploiting the low-rank structure of the spectro-temporal components belonging to each source. This enables us to identify the early support of the room impulse response function and its unique map to the room geometry. To further tackle the ambiguity of the reflection ratios, we propose a novel formulation of the reverberation model and estimate the absorption coefficients through a convex optimization exploiting joint sparsity model formulated upon spatio-spectral sparsity of concurrent speech representation. The acoustic parameters are then incorporated for separating individual speech signals through either structured sparse recovery or inverse filtering the acoustic channels. The experiments conducted on real data recordings demonstrate the effectiveness of the proposed approach for multi-party speech recovery and recognition.Comment: 31 page

arXiv.org e-Print Archive

Edinburgh Research Explorer

Low-rank and Sparse Soft Targets to Learn Better DNN Acoustic Models

Author: Asaei Afsaneh
Bourlard Herve
Dighe Pranay
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 18/10/2016
Field of study

Conventional deep neural networks (DNN) for speech acoustic modeling rely on Gaussian mixture models (GMM) and hidden Markov model (HMM) to obtain binary class labels as the targets for DNN training. Subword classes in speech recognition systems correspond to context-dependent tied states or senones. The present work addresses some limitations of GMM-HMM senone alignments for DNN training. We hypothesize that the senone probabilities obtained from a DNN trained with binary labels can provide more accurate targets to learn better acoustic models. However, DNN outputs bear inaccuracies which are exhibited as high dimensional unstructured noise, whereas the informative components are structured and low-dimensional. We exploit principle component analysis (PCA) and sparse coding to characterize the senone subspaces. Enhanced probabilities obtained from low-rank and sparse reconstructions are used as soft-targets for DNN acoustic modeling, that also enables training with untranscribed data. Experiments conducted on AMI corpus shows 4.6% relative reduction in word error rate

arXiv.org e-Print Archive

Crossref

Exploiting Prior Knowledge in Compressed Sensing Wireless ECG Systems

Author: Barner Kenneth E.
Blanco-Velasco Manuel
Carrillo Rafael E.
Polania Luisa F.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 29/05/2014
Field of study

Recent results in telecardiology show that compressed sensing (CS) is a promising tool to lower energy consumption in wireless body area networks for electrocardiogram (ECG) monitoring. However, the performance of current CS-based algorithms, in terms of compression rate and reconstruction quality of the ECG, still falls short of the performance attained by state-of-the-art wavelet based algorithms. In this paper, we propose to exploit the structure of the wavelet representation of the ECG signal to boost the performance of CS-based methods for compression and reconstruction of ECG signals. More precisely, we incorporate prior information about the wavelet dependencies across scales into the reconstruction algorithms and exploit the high fraction of common support of the wavelet coefficients of consecutive ECG segments. Experimental results utilizing the MIT-BIH Arrhythmia Database show that significant performance gains, in terms of compression rate and reconstruction quality, can be obtained by the proposed algorithms compared to current CS-based methods.Comment: Accepted for publication at IEEE Journal of Biomedical and Health Informatic

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Bayesian Hypothesis Testing for Block Sparse Signal Recovery

Author: Korki Mehdi
Zayyani Hadi
Zhang Jingxin
Publication venue
Publication date: 22/08/2015
Field of study

This letter presents a novel Block Bayesian Hypothesis Testing Algorithm (Block-BHTA) for reconstructing block sparse signals with unknown block structures. The Block-BHTA comprises the detection and recovery of the supports, and the estimation of the amplitudes of the block sparse signal. The support detection and recovery is performed using a Bayesian hypothesis testing. Then, based on the detected and reconstructed supports, the nonzero amplitudes are estimated by linear MMSE. The effectiveness of Block-BHTA is demonstrated by numerical experiments.Comment: 5 pages, 2 figures. arXiv admin note: text overlap with arXiv:1412.231

arXiv.org e-Print Archive

Swinburne Research Bank