A Study into Speech Enhancement Techniques in Adverse Environment
This dissertation develops speech enhancement techniques that improve speech quality in applications such as mobile communications, teleconferencing, and smart loudspeakers. These applications require suppressing both noise and reverberation. The contribution of the dissertation is therefore twofold: a single-channel speech enhancement system that exploits the temporal and spectral diversity of the received microphone signal for noise suppression, and a multi-channel speech enhancement method that employs spatial diversity to reduce reverberation.
The Linear Model under Mixed Gaussian Inputs: Designing the Transfer Matrix
Suppose a linear model y = Hx + n, where inputs x, n are independent Gaussian
mixtures. The problem is to design the transfer matrix H so as to minimize the
mean square error (MSE) when estimating x from y. This problem has important
applications, but faces at least three hurdles. Firstly, even for a fixed H,
the minimum MSE (MMSE) has no analytical form. Secondly, the MMSE is generally
not convex in H. Thirdly, derivatives of the MMSE w.r.t. H are hard to obtain.
This paper casts the problem as a stochastic program and invokes gradient
methods. The study is motivated by two applications in signal processing. One
concerns the choice of error-reducing precoders; the other deals with selection
of pilot matrices for channel estimation. In either setting, our numerical
results indicate improved estimation accuracy, markedly better than that
obtained by optimal design based on standard linear estimators. Some
implications of the non-convexities of the MMSE are noteworthy, yet, to our
knowledge, not well known. For example, there are cases in which more pilot
power is detrimental for channel estimation. This paper explains why.
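
The first hurdle named in the abstract above (no analytical form for the MMSE even at a fixed H) can be illustrated with a minimal scalar sketch. This is not the paper's method: it assumes a scalar gain `h` in place of the matrix H, a single-Gaussian noise term (a one-component mixture), and purely illustrative mixture parameters. The conditional-mean estimator is available in closed form for each observation, but its mean square error is estimated here by Monte Carlo sampling and compared with the standard linear (LMMSE) estimator. All function names are hypothetical.

```python
import math
import random


def gauss_pdf(y, mean, var):
    """Density of a scalar Gaussian N(mean, var) evaluated at y."""
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def mmse_estimate(y, h, weights, means, variances, noise_var):
    """Conditional mean E[x | y] for y = h*x + n, with x a Gaussian mixture."""
    # Unnormalized posterior probability of each mixture component given y.
    likes = [w * gauss_pdf(y, h * m, h * h * v + noise_var)
             for w, m, v in zip(weights, means, variances)]
    total = sum(likes)
    estimate = 0.0
    for like, m, v in zip(likes, means, variances):
        # Within one component the model is jointly Gaussian, so the
        # component-wise posterior mean is linear in y.
        gain = h * v / (h * h * v + noise_var)
        estimate += (like / total) * (m + gain * (y - h * m))
    return estimate


def lmmse_estimate(y, h, weights, means, variances, noise_var):
    """Best linear estimator, using only the overall mean and variance of x."""
    mean_x = sum(w * m for w, m in zip(weights, means))
    var_x = sum(w * (v + m * m) for w, m, v in zip(weights, means, variances)) - mean_x ** 2
    gain = h * var_x / (h * h * var_x + noise_var)
    return mean_x + gain * (y - h * mean_x)


def monte_carlo_mse(h, weights, means, variances, noise_var, num_samples, seed):
    """Sample (x, y) pairs and return the empirical (MMSE, LMMSE) errors."""
    rng = random.Random(seed)
    se_mmse = 0.0
    se_lmmse = 0.0
    for _ in range(num_samples):
        k = rng.choices(range(len(weights)), weights=weights)[0]
        x = rng.gauss(means[k], math.sqrt(variances[k]))
        y = h * x + rng.gauss(0.0, math.sqrt(noise_var))
        se_mmse += (x - mmse_estimate(y, h, weights, means, variances, noise_var)) ** 2
        se_lmmse += (x - lmmse_estimate(y, h, weights, means, variances, noise_var)) ** 2
    return se_mmse / num_samples, se_lmmse / num_samples


if __name__ == "__main__":
    # Illustrative two-component input; the values are assumptions, not from the paper.
    for h in (0.5, 1.0, 2.0):
        mmse, lmmse = monte_carlo_mse(h, [0.5, 0.5], [-3.0, 3.0], [0.01, 0.01],
                                      1.0, 20000, seed=0)
        print(f"h={h}: MMSE~{mmse:.4f}  LMMSE~{lmmse:.4f}")
```

Sweeping `h` as in the usage lines gives a sampled view of how the estimation error depends on the design variable, which is the quantity the paper optimizes in the matrix case.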
Denoising Deep Neural Networks Based Voice Activity Detection
Recently, deep-belief-network (DBN) based voice activity detection (VAD)
has been proposed. It is powerful in fusing the advantages of multiple
features, and achieves state-of-the-art performance. However, the deep
layers of the DBN-based VAD do not show an apparent superiority to the
shallower layers. In this paper, we propose a denoising-deep-neural-network
(DDNN) based VAD to address the aforementioned problem. Specifically, we
pre-train a deep neural network in a special unsupervised denoising greedy
layer-wise mode, and then fine-tune the whole network in a supervised way by
the common back-propagation algorithm. In the pre-training phase, we take the
noisy speech signals as the visible layer and try to extract a new feature that
minimizes the reconstruction cross-entropy loss between the noisy speech
signals and their corresponding clean speech signals. Experimental results show
that the proposed DDNN-based VAD not only outperforms the DBN-based VAD but
also shows an apparent performance improvement of the deep layers over
shallower layers.
Comment: This paper has been accepted by IEEE ICASSP-2013, and will be
published online after May, 201
Atomic norm denoising with applications to line spectral estimation
Motivated by recent work on atomic norms in inverse problems, we propose a
new approach to line spectral estimation that provides theoretical guarantees
for the mean-squared-error (MSE) performance in the presence of noise and
without knowledge of the model order. We propose an abstract theory of
denoising with atomic norms and specialize this theory to provide a convex
optimization problem for estimating the frequencies and phases of a mixture of
complex exponentials. We show that the associated convex optimization problem
can be solved in polynomial time via semidefinite programming (SDP). We also
show that the SDP can be approximated by an l1-regularized least-squares
problem that achieves nearly the same error rate as the SDP but can scale to
much larger problems. We compare both SDP and l1-based approaches with
classical line spectral analysis methods and demonstrate that the SDP
outperforms the l1 optimization which outperforms MUSIC, Cadzow's, and Matrix
Pencil approaches in terms of MSE over a wide range of signal-to-noise ratios.
Comment: 27 pages, 10 figures. A preliminary version of this work appeared in
the Proceedings of the 49th Annual Allerton Conference in September 2011.
Numerous numerical experiments added to this version in accordance with
suggestions by anonymous reviewer
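
The l1-regularized least-squares idea mentioned in the abstract above can be sketched generically as follows. This is not the paper's SDP, and not necessarily its exact l1 formulation: it assumes a uniform frequency grid, frequencies that fall exactly on that grid, and a plain ISTA (iterative soft-thresholding) solver. The function names, grid size, and regularization weight are illustrative assumptions.

```python
import cmath
import math
import random


def build_dictionary(num_samples, grid_size):
    """Rows are time samples, columns are unit-modulus complex exponentials
    at the grid frequencies k / grid_size."""
    return [[cmath.exp(2j * math.pi * (k / grid_size) * t) for k in range(grid_size)]
            for t in range(num_samples)]


def soft_threshold(z, tau):
    """Complex soft-thresholding: shrink the magnitude by tau, keep the phase."""
    mag = abs(z)
    return z * (1.0 - tau / mag) if mag > tau else 0j


def ista_line_spectrum(y, grid_size, lam, num_iters):
    """Approximately minimize 0.5*||y - A c||^2 + lam*||c||_1 over grid coefficients c."""
    n = len(y)
    A = build_dictionary(n, grid_size)
    # For grid_size >= n, A A^H = grid_size * I, so the gradient's Lipschitz
    # constant is grid_size and 1/grid_size is a valid ISTA step size.
    step = 1.0 / grid_size
    c = [0j] * grid_size
    for _ in range(num_iters):
        residual = [y[t] - sum(A[t][k] * c[k] for k in range(grid_size))
                    for t in range(n)]
        corr = [sum(A[t][k].conjugate() * residual[t] for t in range(n))
                for k in range(grid_size)]
        c = [soft_threshold(c[k] + step * corr[k], step * lam)
             for k in range(grid_size)]
    return c


def make_signal(num_samples, bins, grid_size, amplitudes, noise_std, seed):
    """Sum of on-grid complex exponentials plus complex Gaussian noise."""
    rng = random.Random(seed)
    signal = []
    for t in range(num_samples):
        value = sum(a * cmath.exp(2j * math.pi * (b / grid_size) * t)
                    for a, b in zip(amplitudes, bins))
        value += complex(rng.gauss(0.0, noise_std), rng.gauss(0.0, noise_std))
        signal.append(value)
    return signal


if __name__ == "__main__":
    # Illustrative setup: two tones at grid bins 5 and 20 out of 64.
    y = make_signal(32, [5, 20], 64, [1.0, 1j], 0.05, seed=0)
    c = ista_line_spectrum(y, 64, lam=2.0, num_iters=100)
    peaks = sorted(range(64), key=lambda k: abs(c[k]), reverse=True)[:2]
    print("strongest grid bins:", sorted(peaks))
```

A finer grid reduces off-grid mismatch but increases the correlation between neighboring atoms, which is one reason a gridless formulation such as the atomic norm is of interest.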