Reverberation time estimation on the ACE corpus using the SDD method
Reverberation Time (T60) is an important measure for characterizing the
properties of a room. The author's T60 estimation algorithm was previously
tested on simulated data where the noise is artificially added to the speech
after convolution with impulse responses simulated using the image method. We
test the algorithm on speech convolved with real recorded impulse responses and
noise from the same rooms from the Acoustic Characterization of Environments
(ACE) corpus and achieve results comparable to those using simulated
data.
Comment: In Proceedings of the ACE Challenge Workshop - a satellite event of
IEEE-WASPAA 2015 (arXiv:1510.00383)
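The simulated test condition described in the abstract above (speech convolved with an impulse response, with noise added afterwards) can be sketched as follows. This is a minimal illustration, not the author's evaluation code: the function name `reverberate_and_add_noise`, the `snr_db` parameter, and the choice to tile or truncate the noise to the signal length are all assumptions.

```python
import numpy as np

def reverberate_and_add_noise(speech, rir, noise, snr_db):
    """Convolve speech with a room impulse response, then add noise at snr_db.

    Hypothetical helper for illustration. Returns the noisy mixture, the
    reverberant speech, and the scaled noise.
    """
    # Full linear convolution: output length is len(speech) + len(rir) - 1.
    reverberant = np.convolve(speech, rir)
    # Assumption: repeat or truncate the noise so it matches the signal length.
    noise = np.resize(noise, reverberant.shape)
    signal_power = np.mean(reverberant ** 2)
    noise_power = np.mean(noise ** 2)
    # Scale the noise so that signal_power / scaled_noise_power = 10^(snr_db / 10).
    gain = np.sqrt(signal_power / (noise_power * 10 ** (snr_db / 10)))
    scaled_noise = gain * noise
    return reverberant + scaled_noise, reverberant, scaled_noise
```

With recorded data such as the ACE corpus, `rir` and `noise` would come from measurements made in the same room rather than from a simulation.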
Multi-scale Multi-band DenseNets for Audio Source Separation
This paper deals with the problem of audio source separation. To handle the
complex and ill-posed nature of audio source separation,
current state-of-the-art approaches employ deep neural networks to obtain
instrumental spectra from a mixture. In this study, we propose a novel network
architecture that extends the recently developed densely connected
convolutional network (DenseNet), which has shown excellent results on image
classification tasks. To deal with the specific problem of audio source
separation, an up-sampling layer, block skip connection and band-dedicated
dense blocks are incorporated on top of DenseNet. The proposed approach takes
advantage of long contextual information and outperforms state-of-the-art
results on the SiSEC 2016 competition by a large margin in terms of
signal-to-distortion ratio. Moreover, the proposed architecture requires
significantly fewer parameters and considerably less training time compared
with other methods.
Comment: to appear at WASPAA 201
PSD Estimation of Multiple Sound Sources in a Reverberant Room Using a Spherical Microphone Array
We propose an efficient method to estimate source power spectral densities
(PSDs) in a multi-source reverberant environment using a spherical microphone
array. The proposed method utilizes the spatial correlation between the
spherical harmonics (SH) coefficients of a sound field to estimate source PSDs.
The use of the spatial cross-correlation of the SH coefficients allows us to
employ the method in an environment with a higher number of sources compared to
conventional methods. Furthermore, the orthogonality property of the SH basis
functions saves the effort of designing specific beampatterns of a conventional
beamformer-based method. We evaluate the performance of the algorithm with
different numbers of sources in practical reverberant and non-reverberant rooms.
We also demonstrate an application of the method by separating source signals
using a conventional beamformer and a Wiener post-filter designed from the
estimated PSDs.
Comment: Accepted for WASPAA 201
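To illustrate how estimated PSDs can drive a Wiener post-filter, the sketch below computes the textbook Wiener gain per frequency bin: each source's PSD divided by the total PSD. It is an assumption-laden example, not the paper's implementation; the function name `wiener_gains` and the `noise_psd` and `floor` parameters are hypothetical.

```python
import numpy as np

def wiener_gains(source_psds, noise_psd=0.0, floor=1e-12):
    """Return per-source Wiener gains from estimated PSDs.

    source_psds: array of shape (n_sources, n_bins) with non-negative PSD
    estimates. noise_psd: scalar or (n_bins,) array added to the denominator.
    Hypothetical helper for illustration.
    """
    source_psds = np.asarray(source_psds, dtype=float)
    # Total power in each bin: all sources plus any additive noise term.
    total = source_psds.sum(axis=0) + noise_psd
    # The floor guards against division by zero in empty bins.
    return source_psds / np.maximum(total, floor)
```

Each row of the result would multiply the corresponding beamformer output spectrum bin by bin. With `noise_psd=0`, the gains across sources sum to one in every bin that contains energy.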