DNN-Based Source Enhancement to Increase Objective Sound Quality Assessment Score
We propose a training method for deep neural network (DNN)-based source enhancement to increase objective sound quality assessment (OSQA) scores such as the perceptual evaluation of speech quality (PESQ). In many conventional studies, DNNs have been used as a mapping function to estimate time-frequency masks and trained to minimize an analytically tractable objective function such as the mean squared error (MSE). Since OSQA scores are widely used for sound-quality evaluation, training DNNs to increase OSQA scores should yield higher-quality output signals than minimum-MSE training. However, since most OSQA scores are not analytically tractable, i.e., they are black boxes, the gradient of the objective function cannot be calculated by simply applying back-propagation. To calculate the gradient of the OSQA-based objective function, we formulated a DNN optimization scheme on the basis of black-box optimization, of the kind used to train game-playing computer programs. As the black-box optimization scheme, we adopt the policy gradient method, which calculates the gradient on the basis of a sampling algorithm. To simulate output signals using the sampling algorithm, DNNs are used to estimate the probability density function of the output signals that maximize OSQA scores. The OSQA scores are calculated from the simulated output signals, and the DNNs are trained to increase the probability of generating the simulated output signals that achieve high OSQA scores. Through several experiments, we found that OSQA scores increased significantly when the proposed method was applied, even though the MSE was not minimized.
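The sampling-based gradient described in this abstract can be illustrated with a toy sketch. It is not the authors' implementation: the DNN is replaced by a plain parameter vector `mu` (the mean of a Gaussian over three mask values), and `black_box_score` is a hypothetical stand-in for an OSQA score such as PESQ. The point is only that the update uses sampled outputs and their scores, never a derivative of the score itself.

```python
import random

def black_box_score(mask):
    # Hypothetical stand-in for an OSQA score (e.g. PESQ); treated as a black box.
    target = [0.9, 0.1, 0.6]
    return -sum((m - t) ** 2 for m, t in zip(mask, target))

def reinforce_step(mu, sigma, n_samples, lr, rng):
    # Sample candidate outputs from the current Gaussian policy N(mu, sigma^2).
    samples = [[rng.gauss(m, sigma) for m in mu] for _ in range(n_samples)]
    scores = [black_box_score(s) for s in samples]
    baseline = sum(scores) / n_samples  # mean score, used to reduce variance
    grad = [0.0] * len(mu)
    for s, r in zip(samples, scores):
        for k in range(len(mu)):
            # d/d(mu_k) of log N(s_k; mu_k, sigma^2) is (s_k - mu_k) / sigma^2
            grad[k] += (r - baseline) * (s[k] - mu[k]) / sigma ** 2
    # Gradient ascent: raise the probability of samples that scored well.
    return [m + lr * g / n_samples for m, g in zip(mu, grad)]

def train(steps=300, seed=0):
    rng = random.Random(seed)
    mu = [0.5, 0.5, 0.5]  # assumed initial mask estimate
    for _ in range(steps):
        mu = reinforce_step(mu, sigma=0.1, n_samples=32, lr=0.05, rng=rng)
    return mu
```

Calling `train()` moves `mu` toward values that score well under `black_box_score`, although the score function is only ever evaluated and never differentiated.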
Acoustic echo and noise canceller for personal hands-free video IP phone
This paper presents the implementation and evaluation of a proposed acoustic echo and noise canceller (AENC) for videotelephony-enabled personal hands-free Internet protocol (IP) phones. The canceller has the following features: noise-robust performance, low processing delay, and low computational complexity. The AENC employs adaptive digital filter (ADF) and noise reduction (NR) methods that can effectively eliminate undesired acoustic echo and background noise from a microphone signal, even in a noisy environment. The ADF method controls its step size according to the level of disturbance such as background noise, which minimizes the effect of the disturbance in a noisy environment. The NR method estimates the noise level under the assumption that the noise amplitude spectrum is constant over a short period, an assumption that does not hold for the amplitude spectrum of speech. In addition, this paper presents a method for decreasing the computational complexity of the ADF process without increasing the processing delay, making the processing suitable for real-time implementation. The experimental results demonstrate that the proposed AENC suppresses echo and noise sufficiently in a noisy environment, resulting in natural-sounding speech.
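As a rough illustration of an adaptive filter whose step size shrinks as the disturbance grows, the sketch below uses a normalized LMS (NLMS) update with a noise-dependent regularization term. This is a common textbook choice swapped in for illustration, not the step-size rule of the paper; `noise_power`, the echo path in `simulate`, and all parameter values are assumptions.

```python
import random

def nlms_echo_canceller(far_end, mic, taps, mu=0.5, noise_power=0.0, eps=1e-8):
    w = [0.0] * taps    # adaptive estimate of the echo path
    buf = [0.0] * taps  # most recent far-end samples, newest first
    out = []
    for x, d in zip(far_end, mic):
        buf = [x] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))  # estimated echo
        e = d - y                                   # echo-cancelled output
        energy = sum(b * b for b in buf)
        # Assumed control rule: a larger noise estimate gives a smaller step.
        step = mu / (energy + taps * noise_power + eps)
        w = [wi + step * e * bi for wi, bi in zip(w, buf)]
        out.append(e)
    return w, out

def simulate(n=2000, noise_std=0.01, seed=1):
    rng = random.Random(seed)
    echo_path = [0.5, -0.3, 0.1]  # hypothetical room response
    far_end = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    mic = []
    for i in range(n):
        echo = sum(h * far_end[i - k] for k, h in enumerate(echo_path) if i - k >= 0)
        mic.append(echo + rng.gauss(0.0, noise_std))
    return far_end, mic, echo_path
```

For example, `far_end, mic, h = simulate()` followed by `nlms_echo_canceller(far_end, mic, taps=3, noise_power=1e-4)` returns filter weights that approximate `h` together with the residual signal.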
Ab initio study on the magneto-structural properties of MnAs
The magnetic and structural properties of MnAs are studied with ab initio
methods, and by mapping total energies onto a Heisenberg model. The stability
of the different phases is found to depend mainly on the volume and on the
amount of magnetic order, confirming previous experimental findings and
phenomenological models. It is generally found that for large lattice constants
the ferromagnetic state is favored, whereas for small lattice constants
different antiferromagnetic states can be stabilized. In the ferromagnetic
state the structure with minimal energy is always hexagonal, whereas it becomes
orthorhombically distorted if there is an antiferromagnetic component in the
hexagonal plane. For the paramagnetic state the stable cell is found to be
orthorhombic up to a critical lattice constant of about 3.7 Angstrom, above
which it remains hexagonal. This leads to a second-order structural phase
transition between paramagnetic states at about 400 K, where the lattice
parameter increases above this critical value with rising temperature due to
the thermal expansion. For the paramagnetic state an analytic approximation for
the magnitude of the orthorhombic distortion as a function of the lattice
constant is given. Within the mean field approximation the dependence of the
Curie temperature on the volume and on the orthorhombic distortion is
calculated. For orthorhombically distorted cells the Curie temperature is much
smaller than for hexagonal cells. This is mainly due to the fact that some of
the exchange coupling constants in the hexagonal plane become negative for
distorted cells. With these results a description of the susceptibility as a
function of temperature is given.
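For orientation, the standard mean-field estimate of the Curie temperature for a classical Heisenberg model with one magnetic sublattice is sketched below. The Hamiltonian convention (each pair counted twice, unit vectors $\mathbf{e}_i$) is an assumption made here; the abstract does not state which convention the paper uses, and other conventions change the prefactor.

```latex
% Assumed convention: classical Heisenberg model, each pair counted twice
H = -\sum_{i \neq j} J_{ij}\, \mathbf{e}_i \cdot \mathbf{e}_j ,
\qquad
k_B T_C^{\mathrm{MF}} = \frac{2}{3} \sum_{j \neq 0} J_{0j}
```

Under this estimate, negative exchange constants $J_{0j}$ reduce the sum and therefore lower $T_C$, which is consistent with the trend the abstract reports for orthorhombically distorted cells.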
Wavenumber-domain signal processing of sound: plane-wave, circular harmonic, and spherical harmonic expansions and array signal processing
Array signal processing with multiple microphone or loudspeaker elements plays an important role in various applications because it can handle the spatial information of sound. This article outlines the fundamentals of the spatial Fourier transform for elements arranged on a line, on a circle, and on a sphere, starting from the solutions of the wave equation. It also describes how to carry out applications such as wave field synthesis and directivity control in the wavenumber domain.
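For the circular-array case, the spatial Fourier transform reduces to a discrete Fourier series over the microphone angles. The sketch below is a minimal illustration under assumed conditions: the microphones are equally spaced on the circle, `pressures` holds one complex pressure value per microphone at a single frequency, and the function name is hypothetical. The radial terms that relate these coefficients to the sound field itself are not modelled.

```python
import cmath
import math

def circular_harmonic_coeffs(pressures, max_order):
    # pressures[q] is the complex pressure at angle 2*pi*q/M on the circle.
    # To avoid spatial aliasing, M should exceed 2 * max_order.
    m_count = len(pressures)
    coeffs = {}
    for order in range(-max_order, max_order + 1):
        acc = 0j
        for q, p in enumerate(pressures):
            phi = 2.0 * math.pi * q / m_count
            acc += p * cmath.exp(-1j * order * phi)
        coeffs[order] = acc / m_count  # coefficient of exp(j * order * phi)
    return coeffs

def synthetic_field(m_count):
    # Hypothetical test field: 2*exp(j*phi) + 0.5*exp(-2j*phi)
    field = []
    for q in range(m_count):
        phi = 2.0 * math.pi * q / m_count
        field.append(2.0 * cmath.exp(1j * phi) + 0.5 * cmath.exp(-2j * phi))
    return field
```

Applying `circular_harmonic_coeffs(synthetic_field(8), 3)` recovers the two non-zero coefficients at orders 1 and -2, with the remaining orders close to zero.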
- …
