Structured Sparsity Models for Multiparty Speech Recovery from Reverberant Recordings
We tackle the multi-party speech recovery problem by modeling the
acoustics of reverberant chambers. Our approach exploits structured sparsity
models to perform room modeling and speech recovery. We propose a scheme for
characterizing the room acoustics from the unknown competing speech sources
relying on localization of the early images of the speakers by sparse
approximation of the spatial spectra of the virtual sources in a free-space
model. The images are then clustered exploiting the low-rank structure of the
spectro-temporal components belonging to each source. This enables us to
identify the early support of the room impulse response function and its unique
map to the room geometry. To further tackle the ambiguity of the reflection
ratios, we propose a novel formulation of the reverberation model and estimate
the absorption coefficients through a convex optimization exploiting joint
sparsity model formulated upon spatio-spectral sparsity of concurrent speech
representation. The acoustic parameters are then incorporated for separating
individual speech signals through either structured sparse recovery or inverse
filtering the acoustic channels. The experiments conducted on real data
recordings demonstrate the effectiveness of the proposed approach for
multi-party speech recovery and recognition.
Comment: 31 pages
Low-rank and Sparse Soft Targets to Learn Better DNN Acoustic Models
Conventional deep neural networks (DNN) for speech acoustic modeling rely on
Gaussian mixture models (GMM) and hidden Markov models (HMM) to obtain binary
class labels as the targets for DNN training. Subword classes in speech
recognition systems correspond to context-dependent tied states or senones. The
present work addresses some limitations of GMM-HMM senone alignments for DNN
training. We hypothesize that the senone probabilities obtained from a DNN
trained with binary labels can provide more accurate targets to learn better
acoustic models. However, DNN outputs bear inaccuracies which are exhibited as
high dimensional unstructured noise, whereas the informative components are
structured and low-dimensional. We exploit principal component analysis (PCA)
and sparse coding to characterize the senone subspaces. Enhanced probabilities
obtained from low-rank and sparse reconstructions are used as soft-targets for
DNN acoustic modeling, which also enables training with untranscribed data.
Experiments conducted on the AMI corpus show a 4.6% relative reduction in word
error rate.
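The low-rank reconstruction idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `low_rank_soft_targets`, the use of a single global PCA basis rather than per-senone subspaces, and the clipping and renormalization step are all assumptions made for illustration.

```python
import numpy as np

def low_rank_soft_targets(posteriors, rank):
    """Project class posteriors (frames x classes) onto their top-`rank`
    principal components and renormalize each row to sum to one.

    Hypothetical helper: one global PCA basis is an assumption here.
    """
    mean = posteriors.mean(axis=0, keepdims=True)
    centered = posteriors - mean
    # Rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:rank]
    # Low-rank reconstruction: keep only the leading components.
    recon = centered @ basis.T @ basis + mean
    # The projection can produce small negative values, so clip and
    # renormalize to obtain valid probability vectors (an assumption).
    recon = np.clip(recon, 1e-8, None)
    return recon / recon.sum(axis=1, keepdims=True)
```

The returned rows could then serve as soft targets in a cross-entropy loss in place of one-hot labels.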
Exploiting Prior Knowledge in Compressed Sensing Wireless ECG Systems
Recent results in telecardiology show that compressed sensing (CS) is a
promising tool to lower energy consumption in wireless body area networks for
electrocardiogram (ECG) monitoring. However, the performance of current
CS-based algorithms, in terms of compression rate and reconstruction quality of
the ECG, still falls short of the performance attained by state-of-the-art
wavelet based algorithms. In this paper, we propose to exploit the structure of
the wavelet representation of the ECG signal to boost the performance of
CS-based methods for compression and reconstruction of ECG signals. More
precisely, we incorporate prior information about the wavelet dependencies
across scales into the reconstruction algorithms and exploit the high fraction
of common support of the wavelet coefficients of consecutive ECG segments.
Experimental results utilizing the MIT-BIH Arrhythmia Database show that
significant performance gains, in terms of compression rate and reconstruction
quality, can be obtained by the proposed algorithms compared to current
CS-based methods.
Comment: Accepted for publication at IEEE Journal of Biomedical and Health
Informatics
Bayesian Hypothesis Testing for Block Sparse Signal Recovery
This letter presents a novel Block Bayesian Hypothesis Testing Algorithm
(Block-BHTA) for reconstructing block sparse signals with unknown block
structures. The Block-BHTA comprises the detection and recovery of the
supports, and the estimation of the amplitudes of the block sparse signal.
Support detection and recovery are performed using Bayesian hypothesis
testing. Then, based on the detected and reconstructed supports, the nonzero
amplitudes are estimated by linear MMSE. The effectiveness of Block-BHTA is
demonstrated by numerical experiments.
Comment: 5 pages, 2 figures. arXiv admin note: text overlap with
arXiv:1412.231
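The final amplitude-estimation step mentioned in this abstract (linear MMSE on an already detected support) can be sketched as follows. This is a generic textbook estimator, not the Block-BHTA itself: the measurement model y = A x + n, the zero-mean i.i.d. priors with variances `signal_var` and `noise_var`, and the function name `lmmse_on_support` are all assumptions for illustration.

```python
import numpy as np

def lmmse_on_support(A, y, support, signal_var=1.0, noise_var=0.01):
    """Linear MMSE estimate of the nonzero amplitudes on a known support.

    Assumes y = A @ x + n, with the entries of x on `support` i.i.d.
    zero-mean with variance `signal_var`, and white noise of variance
    `noise_var`. Entries outside the support are set to zero.
    """
    A_s = A[:, support]  # columns of A belonging to the detected support
    k = len(support)
    # x_s = (A_s^T A_s + (noise_var / signal_var) I)^{-1} A_s^T y
    gram = A_s.T @ A_s + (noise_var / signal_var) * np.eye(k)
    x_hat = np.zeros(A.shape[1])
    x_hat[support] = np.linalg.solve(gram, A_s.T @ y)
    return x_hat
```

For a block sparse signal, `support` would list the indices of all detected blocks, for example `[4, 5, 6, 7]` for one block of length four.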