
    Non-intrusive speech quality prediction using modulation energies and LSTM-network

    Many signal processing algorithms have been proposed to improve the quality of speech recorded in the presence of noise and reverberation. Perceptual measures, i.e., listening tests, are usually considered the most reliable way to evaluate the quality of speech processed by such algorithms, but they are costly and time-consuming. Consequently, speech enhancement algorithms are often evaluated using signal-based measures, which can be either intrusive or non-intrusive. Because intrusive measures require a reference signal, only non-intrusive measures can be used in applications where the clean speech signal is not available. However, many existing non-intrusive measures correlate poorly with perceived speech quality, particularly when applied over a wide range of algorithms or acoustic conditions. In this paper, we propose a novel non-intrusive measure of the quality of processed speech that combines modulation energy features with a recurrent neural network using long short-term memory cells. We collected a dataset of perceptually evaluated signals representing several acoustic conditions and algorithms and used this dataset to train and evaluate the proposed measure. Results show that the proposed measure correlates more strongly with perceptual speech quality than benchmark intrusive and non-intrusive measures do when various categories of algorithms are considered. Although the proposed measure is sensitive to mismatch between training and testing conditions, results show that it is a useful approach for evaluating specific algorithms over a wide range of acoustic conditions and may thus become particularly useful for real-time selection of speech enhancement algorithm settings.

    Neutral atoms in ionic lattices: Stability and ground state properties of KCl:Ag(0)

    The equilibrium geometry of Ag⁰ centers formed at cation sites in KCl has been investigated by means of total-energy calculations carried out on clusters of different sizes. Two distinct methods have been employed: first, an ab initio wave-function-based method on embedded clusters and, second, density-functional theory (DFT) methods on clusters in vacuo involving up to 117 atoms. In the ab initio calculations the obtained equilibrium Ag⁰–Cl⁻ distance Re is 3.70 Å, implying a large outward relaxation of 18%, along with a 7% relaxation for the distance between Ag⁰ and the first K⁺ ions in ⟨100⟩ directions. A very similar result is reached through DFT with a 39-atom cluster. Both approaches lead to a rather shallow minimum of the total-energy surface; the associated force constant of the A1g mode is several times smaller than that found for other impurities in halides. These conclusions are shown to be compatible with available experimental results. The shallow minimum is not clearly seen in DFT calculations with larger clusters. The unpaired electron density on silver and Cl ligands has been calculated as a function of the metal-ligand distance and has been compared with values derived from electron-paramagnetic resonance data. The DFT calculations for all cluster sizes indicate that the experimental hyperfine and superhyperfine constants are compatible when Re is close to 3.70 Å. The important relation between the electronic stability of a neutral atom inside an ionic lattice and the local relaxation is established through a simple electrostatic model. As the most remarkable features, it is shown that (i) the cationic Ag⁰ center is not likely to be formed inside AgCl, (ii) in the Ag⁰ center encountered in SrCl2, the silver atom is probably located at an anion site, and (iii) the properties of a center like KCl:Ag⁰ would experience significant changes under hydrostatic pressures of the order of 6 GPa.

    Blind Suppression of Nonstationary Diffuse Acoustic Noise Based on Spatial Covariance Matrix Decomposition

    We propose methods for blind suppression of nonstationary diffuse noise based on decomposition of the observed spatial covariance matrix into signal and noise parts. In modeling noise to regularize the ill-posed decomposition problem, we exploit spatial invariance (isotropy) instead of temporal invariance (stationarity). The isotropy assumption is that the spatial cross-spectrum of noise is dependent on the distance between microphones and independent of the direction between them. We propose methods for spatial covariance matrix decomposition based on least squares and maximum likelihood estimation. The methods are validated on real-world recordings.
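
    The isotropy assumption stated in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract only says that the noise cross-spectrum depends on inter-microphone distance, and the sketch below assumes one common concrete choice, the spherically diffuse (sinc) coherence model. The function names, the microphone layout, and the speed of sound are all hypothetical, chosen for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value, not taken from the abstract


def diffuse_coherence(distance_m, freq_hz, c=SPEED_OF_SOUND):
    """Coherence of spherically diffuse noise between two points.

    Assumed model: sin(x) / x with x = 2*pi*f*d/c. It depends only on the
    distance d, never on the direction, which is the isotropy property.
    """
    x = 2.0 * math.pi * freq_hz * distance_m / c
    return 1.0 if x == 0.0 else math.sin(x) / x


def isotropic_coherence_matrix(mic_positions, freq_hz):
    """Build the noise coherence matrix for an arbitrary array geometry."""
    n = len(mic_positions)
    return [
        [
            diffuse_coherence(math.dist(mic_positions[i], mic_positions[j]), freq_hz)
            for j in range(n)
        ]
        for i in range(n)
    ]


# Hypothetical array: microphones 1 and 2 are both 10 cm from microphone 0,
# but lie in different directions.
mics = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0)]
gamma = isotropic_coherence_matrix(mics, freq_hz=1000.0)
```

    Here `gamma[0][1]` and `gamma[0][2]` coincide because the two microphone pairs share the same spacing, even though they point in different directions. A structural constraint of this kind is what makes the otherwise ill-posed split of the observed covariance into signal and noise parts tractable.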