Non-intrusive speech quality prediction using modulation energies and LSTM-network
Many signal processing algorithms have been proposed to improve the quality of speech recorded in the presence of noise and reverberation. Perceptual measures, i.e., listening tests, are usually considered the most reliable way to evaluate the quality of speech processed by such algorithms but are costly and time-consuming. Consequently, speech enhancement algorithms are often evaluated using signal-based measures, which can be either intrusive or non-intrusive. As the computation of intrusive measures requires a reference signal, only non-intrusive measures can be used in applications for which the clean speech signal is not available. However, many existing non-intrusive measures correlate poorly with the perceived speech quality, particularly when applied over a wide range of algorithms or acoustic conditions. In this paper, we propose a novel non-intrusive measure of the quality of processed speech that combines modulation energy features and a recurrent neural network using long short-term memory cells. We collected a dataset of perceptually evaluated signals representing several acoustic conditions and algorithms and used this dataset to train and evaluate the proposed measure. Results show that the proposed measure yields higher correlation with perceptual speech quality than benchmark intrusive and non-intrusive measures when considering various categories of algorithms. Although the proposed measure is sensitive to mismatch between training and testing, results show that it is a useful approach to evaluate specific algorithms over a wide range of acoustic conditions and may, thus, become particularly useful for real-time selection of speech enhancement algorithm settings.
Neutral atoms in ionic lattices: Stability and ground state properties of KCl:Ag(0)
The equilibrium geometry of Ag⁰ centers formed at cation sites in KCl has been investigated by means of total-energy calculations carried out on clusters of different sizes. Two distinct methods have been employed: first, an ab initio wave-function-based method on embedded clusters and, second, density-functional theory (DFT) methods on clusters in vacuo involving up to 117 atoms. In the ab initio calculations the obtained equilibrium Ag⁰-Cl⁻ distance R_e is 3.70 Å, implying a large outward relaxation of 18%, along with 7% relaxation for the distance between Ag⁰ and the first K⁺ ions in ⟨100⟩ directions. A very similar result is reached through DFT with a 39-atom cluster. Both approaches lead to a rather shallow minimum of the total-energy surface; the associated force constant of the A1g mode is several times smaller than that found for other impurities in halides. These conclusions are shown to be compatible with available experimental results. The shallow minimum is not clearly seen in DFT calculations with larger clusters. The unpaired electron density on silver and Cl ligands has been calculated as a function of the metal-ligand distance and has been compared with values derived from electron-paramagnetic resonance data. The DFT calculations for all cluster sizes indicate that the experimental hyperfine and superhyperfine constants are compatible when R_e is close to 3.70 Å. The important relation between the electronic stability of a neutral atom inside an ionic lattice and the local relaxation is established through a simple electrostatic model. As the most remarkable features, it is shown that (i) the cationic Ag⁰ center is not likely to be formed inside AgCl, (ii) in the Ag⁰ center encountered in SrCl₂, the silver atom is probably located at an anion site, and (iii) the properties of a center like KCl:Ag⁰ would experience significant changes under hydrostatic pressures of the order of 6 GPa.
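The 18% outward relaxation quoted above can be checked with simple arithmetic. The sketch below assumes that the relaxation is measured relative to the unrelaxed silver-chlorine distance, written here as R_0 (a symbol introduced only for this illustration), and uses only the two numbers given in the abstract:

```latex
\[
  R_e = (1 + 0.18)\, R_0
  \quad\Longrightarrow\quad
  R_0 = \frac{R_e}{1.18} = \frac{3.70\ \text{\AA}}{1.18} \approx 3.14\ \text{\AA}
\]
```

Under that assumption, the implied unrelaxed distance is close to the K-Cl nearest-neighbour spacing of the perfect KCl lattice (about 3.15 Å), as expected for an impurity at a cation site.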
Blind Suppression of Nonstationary Diffuse Acoustic Noise Based on Spatial Covariance Matrix Decomposition
We propose methods for blind suppression of nonstationary diffuse noise based on decomposition of the observed spatial covariance matrix into signal and noise parts. In modeling noise to regularize the ill-posed decomposition problem, we exploit spatial invariance (isotropy) instead of temporal invariance (stationarity). The isotropy assumption is that the spatial cross-spectrum of noise is dependent on the distance between microphones and independent of the direction between them. We propose methods for spatial covariance matrix decomposition based on least squares and maximum likelihood estimation. The methods are validated on real-world recordings.
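The isotropy assumption can be illustrated with a minimal sketch. It uses the spherically diffuse (sinc) coherence model, which is one well-known isotropic noise model chosen here only for illustration; the abstract does not say that the proposed methods use this particular model. The array geometry, frequency, speed of sound, and function names are all hypothetical.

```python
import numpy as np

def pairwise_distances(mics):
    """Distances between all microphone pairs; mics has shape (M, 3), in metres."""
    diff = mics[:, None, :] - mics[None, :, :]
    return np.linalg.norm(diff, axis=-1)

def diffuse_coherence(mics, freq_hz, c=343.0):
    """Spatial coherence matrix of a spherically diffuse field (assumed model).

    np.sinc(x) is sin(pi x) / (pi x), so the expression below equals
    sin(k d) / (k d) with wavenumber k = 2 pi f / c.
    """
    d = pairwise_distances(mics)
    return np.sinc(2.0 * freq_hz * d / c)

# Hypothetical square array with 5 cm sides.
mics = np.array([[0.00, 0.00, 0.0],
                 [0.05, 0.00, 0.0],
                 [0.05, 0.05, 0.0],
                 [0.00, 0.05, 0.0]])
gamma = diffuse_coherence(mics, freq_hz=1000.0)

# Rotating the whole array changes every inter-microphone direction
# but no inter-microphone distance, so the matrix should not change.
theta = np.deg2rad(37.0)
rot = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                [np.sin(theta),  np.cos(theta), 0.0],
                [0.0,            0.0,           1.0]])
gamma_rot = diffuse_coherence(mics @ rot.T, freq_hz=1000.0)
is_isotropic = np.allclose(gamma, gamma_rot)
```

In this sketch, microphone pairs separated by the same distance get the same entry regardless of orientation, which is the structural constraint that the abstract describes as regularizing the decomposition.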