Cepstral analysis based on the Glimpse proportion measure for improving the intelligibility of HMM-based synthetic speech in noise
In this paper we introduce a new cepstral coefficient extraction method based on an intelligibility measure for speech in noise, the Glimpse Proportion measure. This new method aims to increase the intelligibility of speech in noise by modifying the clean speech, and has applications in scenarios such as public announcement and car navigation systems. We first explain how the Glimpse Proportion measure operates and further show how we approximated it to integrate it into an existing spectral envelope parameter extraction method commonly used in the HMM-based speech synthesis framework. We then demonstrate how this new method changes the modelled spectrum according to the characteristics of the noise and show results for a listening test with vocoded and HMM-based synthetic speech. The test indicates that the proposed method can significantly improve the intelligibility of synthetic speech in speech-shaped noise. Index Terms: cepstral coefficient extraction, objective measure for speech intelligibility, Lombard speech, HMM-based speech synthesis
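As a rough illustration of the kind of quantity a glimpse-based measure computes, the sketch below counts the fraction of time-frequency cells in which the speech level exceeds the noise level by a margin. It is not the approximation developed in the paper: the function name `glimpse_proportion`, the plain nested-list input of levels in dB, and the 3 dB default margin are all assumptions made for illustration, and a real implementation would start from an auditory spectro-temporal representation.

```python
def glimpse_proportion(speech_db, noise_db, threshold_db=3.0):
    """Fraction of time-frequency cells where speech exceeds noise by a margin.

    speech_db, noise_db: lists of frames, each frame a list of levels in dB
    (hypothetical inputs; both must have the same shape).
    threshold_db: local SNR margin, assumed here to be 3 dB.
    """
    total = 0
    glimpsed = 0
    for speech_frame, noise_frame in zip(speech_db, noise_db):
        for s, n in zip(speech_frame, noise_frame):
            total += 1
            if s > n + threshold_db:
                glimpsed += 1  # this cell counts as a "glimpse" of the speech
    return glimpsed / total if total else 0.0


# Toy example: 2 frames x 2 frequency channels.
speech = [[60.0, 50.0], [40.0, 70.0]]
noise = [[50.0, 50.0], [45.0, 66.0]]
gp = glimpse_proportion(speech, noise)
```

In this toy case two of the four cells clear the margin, so `gp` is 0.5. Raising the speech energy in the cells that the noise masks least is the intuition behind modifying clean speech to raise such a score.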
Multi-Cell Random Beamforming: Achievable Rate and Degrees of Freedom Region
Random beamforming (RBF) is a practically favourable transmission scheme for
multiuser multi-antenna downlink systems since it requires only partial channel
state information (CSI) at the transmitter. Under the conventional single-cell
setup, RBF is known to achieve the optimal sum-capacity scaling law as the
number of users goes to infinity, thanks to the multiuser diversity enabled
transmission scheduling that virtually eliminates the intra-cell interference.
In this paper, we extend the study of RBF to a more practical multi-cell
downlink system with single-antenna receivers subject to the additional
inter-cell interference (ICI). First, we consider the case of finite
signal-to-noise ratio (SNR) at each receiver. We derive a closed-form
expression of the achievable sum-rate with the multi-cell RBF, based upon which
we show an asymptotic sum-rate scaling law as the number of users goes to
infinity. Next, we consider the high-SNR regime and for tractable analysis
assume that the number of users in each cell scales in a certain order with the
per-cell SNR. Under this setup, we characterize the achievable degrees of
freedom (DoF) for the single-cell case with RBF. Then we extend the analysis to
the multi-cell RBF case by characterizing the DoF region. It is shown that the
DoF region characterization provides useful guidelines on how to design a
cooperative multi-cell RBF system to achieve optimal throughput tradeoffs among
different cells. Furthermore, our results reveal that the multi-cell RBF scheme
achieves the "interference-free DoF" region upper bound for the multi-cell
system, provided that the per-cell number of users has a sufficiently large
scaling order with the SNR. Our result thus confirms the optimality of
multi-cell RBF in this regime even without the complete CSI at the transmitter,
as compared to other full-CSI-requiring transmission schemes such as
interference alignment.
Comment: 28 pages, 6 figures, to appear in IEEE Transactions on Signal
Processing. This work was presented in part at the IEEE International Conference
on Acoustics, Speech, and Signal Processing (ICASSP), Kyoto, Japan, March
25-30, 2012. The authors are with the Department of Electrical and Computer
Engineering, National University of Singapore (emails: {hieudn, elezhang,
elehht}@nus.edu.sg).
Sparse Distributed Learning Based on Diffusion Adaptation
This article proposes diffusion LMS strategies for distributed estimation
over adaptive networks that are able to exploit sparsity in the underlying
system model. The approach relies on convex regularization, common in
compressive sensing, to enhance the detection of sparsity via a diffusive
process over the network. The resulting algorithms endow networks with learning
abilities and allow them to learn the sparse structure from the incoming data
in real-time, and also to track variations in the sparsity of the model. We
provide convergence and mean-square performance analysis of the proposed method
and show under what conditions it outperforms the unregularized diffusion
version. We also show how to adaptively select the regularization parameter.
Simulation results illustrate the advantage of the proposed filters for sparse
data recovery.
Comment: to appear in IEEE Trans. on Signal Processing, 201
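The idea of combining LMS adaptation, a sparsity-promoting penalty, and diffusion over a network can be sketched as follows. This is a minimal adapt-then-combine illustration, assuming an l1 penalty handled through its subgradient (a "zero-attracting" term), uniform averaging over each neighbourhood, and synthetic Gaussian data. The function name, parameter values, and the ring topology are hypothetical choices, not the algorithms or settings analysed in the article.

```python
import random


def sign(x):
    """Sign of x as -1, 0, or 1 (subgradient of |x|, taking 0 at x == 0)."""
    return (x > 0) - (x < 0)


def sparse_diffusion_lms(neighbors, w_true, mu=0.02, rho=5e-4,
                         noise_std=0.1, n_iter=3000, seed=0):
    """Adapt-then-combine diffusion LMS with an l1 (zero-attracting) penalty.

    neighbors[k] lists the nodes whose intermediate estimates node k averages
    (including k itself). All data are synthetic: each node observes
    d = u . w_true + noise with Gaussian regressors u.
    """
    rng = random.Random(seed)
    n_nodes, dim = len(neighbors), len(w_true)
    w = [[0.0] * dim for _ in range(n_nodes)]
    for _ in range(n_iter):
        psi = []
        for k in range(n_nodes):
            u = [rng.gauss(0.0, 1.0) for _ in range(dim)]
            d = sum(ui * wi for ui, wi in zip(u, w_true)) + rng.gauss(0.0, noise_std)
            err = d - sum(ui * wi for ui, wi in zip(u, w[k]))
            # Adaptation: LMS step plus a small pull of every entry toward zero.
            psi.append([wi + mu * err * ui - mu * rho * sign(wi)
                        for wi, ui in zip(w[k], u)])
        # Combination: each node averages its neighbours' intermediate estimates.
        w = [[sum(psi[l][i] for l in neighbors[k]) / len(neighbors[k])
              for i in range(dim)]
             for k in range(n_nodes)]
    return w


# Four nodes on a ring; the unknown vector has a single nonzero entry.
ring = [[3, 0, 1], [0, 1, 2], [1, 2, 3], [2, 3, 0]]
w_true = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
estimates = sparse_diffusion_lms(ring, w_true)
worst_msd = max(sum((a - b) ** 2 for a, b in zip(w_k, w_true)) for w_k in estimates)
```

Setting `rho=0.0` recovers plain diffusion LMS in this sketch, which gives a simple way to compare the regularized and unregularized versions on the same data.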
Proximal Multitask Learning over Networks with Sparsity-inducing Coregularization
In this work, we consider multitask learning problems where clusters of nodes
are interested in estimating their own parameter vector. Cooperation among
clusters is beneficial when the optimal models of adjacent clusters have a good
number of similar entries. We propose a fully distributed algorithm for solving
this problem. The approach relies on minimizing a global mean-square error
criterion regularized by non-differentiable terms to promote cooperation among
neighboring clusters. A general diffusion forward-backward splitting strategy
is introduced. Then, it is specialized to the case of sparsity promoting
regularizers. A closed-form expression for the proximal operator of a weighted
sum of $\ell_1$-norms is derived to achieve higher efficiency. We also provide
conditions on the step-sizes that ensure convergence of the algorithm in the
mean and mean-square error sense. Simulations are conducted to illustrate the
effectiveness of the strategy.
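For readers unfamiliar with forward-backward splitting, the sketch below shows the textbook building block: a gradient (forward) step followed by a proximal (backward) step, here with the well-known soft-thresholding operator for a single l1 norm. It is not the closed-form operator for the weighted sum of norms derived in the paper, and the function names are hypothetical.

```python
def soft_threshold(x, lam):
    """Proximal operator of lam * ||x||_1: shrink each entry toward zero by lam."""
    return [(abs(v) - lam) * (1 if v > 0 else -1) if abs(v) > lam else 0.0
            for v in x]


def forward_backward_step(w, grad, step, lam):
    """One forward-backward iteration for a smooth cost plus lam * ||w||_1.

    w: current estimate, grad: gradient of the smooth cost at w,
    step: step size, lam: regularization weight.
    """
    # Forward step: plain gradient descent on the smooth (e.g. mean-square) part.
    forward = [wi - step * gi for wi, gi in zip(w, grad)]
    # Backward step: proximal operator of the non-differentiable part.
    return soft_threshold(forward, step * lam)


shrunk = soft_threshold([3.0, -0.5, -2.0], 1.0)
updated = forward_backward_step([1.0, 0.2], [0.0, 0.0], 0.5, 1.0)
```

Here `shrunk` is `[2.0, 0.0, -1.0]` and `updated` is `[0.5, 0.0]`: entries smaller in magnitude than the threshold are set exactly to zero, which is how the proximal step promotes sparsity.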
Noise cancellation over spatial regions using adaptive wave domain processing
This paper proposes wave-domain adaptive processing for noise cancellation within a large spatial region. We use fundamental solutions of the Helmholtz wave-equation as basis functions to express the noise field over a spatial region and show the wave-domain processing directly on the decomposition coefficients to control the entire region. A feedback control system is implemented, where only a single microphone array is placed at the boundary of the control region to measure the residual signals, and a loudspeaker array is used to generate the anti-noise signals. We develop the adaptive wave-domain filtered-x least mean square algorithm. Simulation results show that using the proposed method the noise over the entire control region can be significantly reduced with fast convergence in both free-field and reverberant environments.
Thanks to the Australian Research Council's Discovery Projects funding
scheme (project no. DP140103412). The work of J. Zhang was sponsored
by the China Scholarship Council with the Australian National University.
Auditory processing-based features for improving speech recognition in adverse acoustic conditions