    A Study into Speech Enhancement Techniques in Adverse Environment

    This dissertation develops speech enhancement techniques that improve speech quality in applications such as mobile communications, teleconferencing, and smart loudspeakers. These applications require suppressing both noise and reverberation. The contribution is therefore twofold: a single-channel speech enhancement system that exploits the temporal and spectral diversity of the received microphone signal for noise suppression, and a multi-channel speech enhancement method that exploits spatial diversity to reduce reverberation.

    The Linear Model under Mixed Gaussian Inputs: Designing the Transfer Matrix

    Suppose a linear model y = Hx + n, where the inputs x and n are independent Gaussian mixtures. The problem is to design the transfer matrix H so as to minimize the mean square error (MSE) when estimating x from y. This problem has important applications, but faces at least three hurdles. Firstly, even for a fixed H, the minimum MSE (MMSE) has no analytical form. Secondly, the MMSE is generally not convex in H. Thirdly, derivatives of the MMSE w.r.t. H are hard to obtain. This paper casts the problem as a stochastic program and invokes gradient methods. The study is motivated by two applications in signal processing. One concerns the choice of error-reducing precoders; the other deals with the selection of pilot matrices for channel estimation. In either setting, our numerical results indicate improved estimation accuracy, markedly better than that obtained by optimal design based on standard linear estimators. Some implications of the non-convexity of the MMSE are noteworthy yet, to our knowledge, not well known. For example, there are cases in which more pilot power is detrimental for channel estimation. This paper explains why.
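The first hurdle in the abstract above (no closed-form MMSE even for a fixed H) can be illustrated with a minimal sketch. It is not the paper's method: it assumes a scalar model y = h*x + n, a Gaussian-mixture input x, and plain Gaussian noise n (a one-component mixture). Under those assumptions the conditional mean E[x | y] has a closed form, but its mean square error is estimated here by Monte Carlo sampling. All function names and parameter values are hypothetical.

```python
import math
import random


def gauss_pdf(y, mean, var):
    """Density of N(mean, var) evaluated at y."""
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def mmse_estimate(y, h, weights, means, variances, noise_var):
    """Conditional mean E[x | y] for scalar y = h*x + n.

    Assumption: x is a Gaussian mixture, n ~ N(0, noise_var).
    """
    post = []        # unnormalised posterior weight of each mixture component
    cond_means = []  # E[x | y, component k]
    for p, m, v in zip(weights, means, variances):
        var_y = h * h * v + noise_var
        post.append(p * gauss_pdf(y, h * m, var_y))
        cond_means.append(m + h * v / var_y * (y - h * m))
    total = sum(post)
    return sum(w * c for w, c in zip(post, cond_means)) / total


def monte_carlo_mmse(h, weights, means, variances, noise_var, n_samples=20000, seed=0):
    """Sample-average estimate of E[(x - E[x | y])^2] for a fixed gain h."""
    rng = random.Random(seed)
    err = 0.0
    for _ in range(n_samples):
        k = rng.choices(range(len(weights)), weights=weights)[0]
        x = rng.gauss(means[k], math.sqrt(variances[k]))
        y = h * x + rng.gauss(0.0, math.sqrt(noise_var))
        err += (x - mmse_estimate(y, h, weights, means, variances, noise_var)) ** 2
    return err / n_samples


# Hypothetical example: bimodal input, unit-variance Gaussian noise.
weights, means, variances = [0.5, 0.5], [-2.0, 2.0], [0.25, 0.25]
mse_by_gain = {h: monte_carlo_mmse(h, weights, means, variances, 1.0)
               for h in (0.0, 0.5, 1.0, 2.0)}
```

Sweeping h as in `mse_by_gain` gives a sample-based view of how the MMSE depends on the design variable. A stochastic-gradient design loop would replace the sweep with gradient steps on the same kind of sample average.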

    Denoising Deep Neural Networks Based Voice Activity Detection

    Recently, a deep-belief-network (DBN) based voice activity detection (VAD) method has been proposed. It is powerful in fusing the advantages of multiple features and achieves state-of-the-art performance. However, the deep layers of the DBN-based VAD do not show an apparent superiority over the shallower layers. In this paper, we propose a denoising-deep-neural-network (DDNN) based VAD to address this problem. Specifically, we pre-train a deep neural network in a special unsupervised denoising greedy layer-wise mode, and then fine-tune the whole network in a supervised way using the common back-propagation algorithm. In the pre-training phase, we take the noisy speech signals as the visible layer and try to extract a new feature that minimizes the reconstruction cross-entropy loss between the noisy speech signals and their corresponding clean speech signals. Experimental results show that the proposed DDNN-based VAD not only outperforms the DBN-based VAD but also shows an apparent performance improvement of the deep layers over the shallower layers. Comment: This paper has been accepted by IEEE ICASSP-2013, and will be published online after May, 201
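The pre-training step described in the abstract above (noisy input as the visible layer, cross-entropy reconstruction loss against the clean signal) can be sketched for a single layer as follows. This is an illustrative toy, not the authors' implementation: the sigmoid encoder and decoder, the per-sample gradient updates, the learning rate, and the synthetic four-dimensional "features" are all assumptions, and every name is hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, W, b):
    """Sigmoid layer; W holds one row of weights per output unit."""
    return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + bj)
            for row, bj in zip(W, b)]


def cross_entropy(target, recon, eps=1e-12):
    return -sum(t * math.log(r + eps) + (1.0 - t) * math.log(1.0 - r + eps)
                for t, r in zip(target, recon))


def pretrain_denoising_layer(noisy, clean, n_hidden, lr=0.2, epochs=200, seed=0):
    """Train one encoder/decoder pair to reconstruct clean targets from noisy inputs."""
    rng = random.Random(seed)
    n_in = len(noisy[0])
    W_enc = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_hidden)]
    b_enc = [0.0] * n_hidden
    W_dec = [[rng.gauss(0.0, 0.1) for _ in range(n_hidden)] for _ in range(n_in)]
    b_dec = [0.0] * n_in
    history = []  # mean reconstruction loss per epoch
    for _ in range(epochs):
        total = 0.0
        for x, t in zip(noisy, clean):
            h = forward(x, W_enc, b_enc)
            r = forward(h, W_dec, b_dec)
            total += cross_entropy(t, r)
            # Sigmoid output with cross-entropy: gradient w.r.t. the output
            # pre-activation is simply (reconstruction - clean target).
            d_out = [ri - ti for ri, ti in zip(r, t)]
            # Back-propagate to the hidden pre-activations before updating W_dec.
            d_hid = [hj * (1.0 - hj) * sum(d_out[i] * W_dec[i][j] for i in range(n_in))
                     for j, hj in enumerate(h)]
            for i in range(n_in):
                for j in range(n_hidden):
                    W_dec[i][j] -= lr * d_out[i] * h[j]
                b_dec[i] -= lr * d_out[i]
            for j in range(n_hidden):
                for i in range(n_in):
                    W_enc[j][i] -= lr * d_hid[j] * x[i]
                b_enc[j] -= lr * d_hid[j]
        history.append(total / len(noisy))
    return W_enc, b_enc, history


# Synthetic stand-in data (assumption): binary "clean" patterns plus clipped Gaussian noise.
rng = random.Random(1)
patterns = [[1.0, 1.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 0.0]]
clean = [p for p in patterns for _ in range(4)]
noisy = [[min(1.0, max(0.0, v + rng.gauss(0.0, 0.2))) for v in p] for p in clean]
W_enc, b_enc, history = pretrain_denoising_layer(noisy, clean, n_hidden=3)
```

In a greedy layer-wise scheme, the encoder output of one trained layer would serve as the input to the next layer. Supervised fine-tuning with back-propagation would then start from the stacked encoder weights.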

    Atomic norm denoising with applications to line spectral estimation

    Motivated by recent work on atomic norms in inverse problems, we propose a new approach to line spectral estimation that provides theoretical guarantees for the mean-squared-error (MSE) performance in the presence of noise and without knowledge of the model order. We propose an abstract theory of denoising with atomic norms and specialize this theory to provide a convex optimization problem for estimating the frequencies and phases of a mixture of complex exponentials. We show that the associated convex optimization problem can be solved in polynomial time via semidefinite programming (SDP). We also show that the SDP can be approximated by an l1-regularized least-squares problem that achieves nearly the same error rate as the SDP but can scale to much larger problems. We compare both the SDP and l1-based approaches with classical line spectral analysis methods and demonstrate that the SDP outperforms the l1 optimization, which in turn outperforms the MUSIC, Cadzow's, and Matrix Pencil approaches in terms of MSE over a wide range of signal-to-noise ratios. Comment: 27 pages, 10 figures. A preliminary version of this work appeared in the Proceedings of the 49th Annual Allerton Conference in September 2011. Numerous numerical experiments added to this version in accordance with suggestions by anonymous reviewer
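The l1-regularized least-squares approximation mentioned in the abstract above can be sketched as a Lasso over a dictionary of complex exponentials. The sketch makes several assumptions not taken from the abstract: a uniform frequency grid, unit-norm atoms, and iterative soft-thresholding (ISTA) as the solver. The function names, grid size, and regularization weight are hypothetical.

```python
import cmath
import math
import random


def build_dictionary(n_samples, n_grid):
    """Rows index time samples; columns are unit-norm exponentials at frequency k / n_grid."""
    scale = 1.0 / math.sqrt(n_samples)
    return [[scale * cmath.exp(2j * math.pi * k * n / n_grid) for k in range(n_grid)]
            for n in range(n_samples)]


def soft_threshold(z, tau):
    """Complex soft-thresholding: shrink the magnitude by tau and keep the phase."""
    mag = abs(z)
    return z * (1.0 - tau / mag) if mag > tau else 0j


def ista_line_spectrum(y, n_grid, lam, n_iter=200):
    """Minimise 0.5 * ||y - A x||^2 + lam * ||x||_1 by ISTA (requires n_grid >= len(y))."""
    n_samples = len(y)
    A = build_dictionary(n_samples, n_grid)
    # On this grid A A^H = (n_grid / n_samples) * I, so the gradient's
    # Lipschitz constant is exactly that ratio.
    lipschitz = n_grid / n_samples
    x = [0j] * n_grid
    for _ in range(n_iter):
        residual = [y[n] - sum(A[n][k] * x[k] for k in range(n_grid))
                    for n in range(n_samples)]
        correlation = [sum(A[n][k].conjugate() * residual[n] for n in range(n_samples))
                       for k in range(n_grid)]
        x = [soft_threshold(x[k] + correlation[k] / lipschitz, lam / lipschitz)
             for k in range(n_grid)]
    return x


# Hypothetical example: one on-grid tone observed in light complex Gaussian noise.
n_samples, n_grid, true_bin = 32, 128, 20
rng = random.Random(0)
y = [cmath.exp(2j * math.pi * true_bin * n / n_grid)
     + complex(rng.gauss(0.0, 0.05), rng.gauss(0.0, 0.05))
     for n in range(n_samples)]
x_hat = ista_line_spectrum(y, n_grid, lam=0.5)
peak_bin = max(range(n_grid), key=lambda k: abs(x_hat[k]))
estimated_frequency = peak_bin / n_grid
```

A gridded formulation like this can only return frequencies that lie on the grid, so a finer grid trades computation for resolution.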