138 research outputs found
Multi-modal Blind Source Separation with Microphones and Blinkies
We propose a blind source separation algorithm that jointly exploits
measurements by a conventional microphone array and an ad hoc array of low-rate
sound power sensors called blinkies. While providing less information than
microphones, blinkies circumvent some difficulties of microphone arrays in
terms of manufacturing, synchronization, and deployment. The algorithm is
derived from a joint probabilistic model of the microphone and sound power
measurements. We assume the separated sources to follow a time-varying
spherical Gaussian distribution, and the non-negative power measurement
space-time matrix to have a low-rank structure. We show that alternating
updates similar to those of independent vector analysis and Itakura-Saito
non-negative matrix factorization decrease the negative log-likelihood of the
joint distribution. The proposed algorithm is validated via numerical
experiments. Its median separation performance is found to be up to 8 dB higher
than that of independent vector analysis, with significantly reduced
variability. Comment: Accepted at IEEE ICASSP 2019, Brighton, UK. 5 pages, 3 figures.
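The abstract above refers to updates "similar to those of" Itakura-Saito non-negative matrix factorization (IS-NMF). As a rough illustration of that building block only, and not of the joint microphone/blinky algorithm itself, the sketch below factorizes a non-negative matrix V as W H with the standard multiplicative updates for the Itakura-Saito divergence. The function names, the exponent-1/2 form of the update, and all parameters are assumptions made for this example, not details taken from the paper.

```python
import numpy as np


def is_divergence(V, V_hat):
    """Itakura-Saito divergence sum(V/V_hat - log(V/V_hat) - 1)."""
    R = V / V_hat
    return float(np.sum(R - np.log(R) - 1.0))


def is_nmf(V, rank, n_iter=100, seed=0, eps=1e-12):
    """Generic IS-NMF sketch (hypothetical helper, not the paper's method).

    V    : (n_rows, n_cols) strictly positive matrix, e.g. sound power over
           sensors and time
    rank : number of low-rank components
    Returns W (n_rows, rank), H (rank, n_cols) and the cost per iteration.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = V.shape
    # Positive random initialization keeps the factors positive throughout.
    W = rng.random((n_rows, rank)) + 0.1
    H = rng.random((rank, n_cols)) + 0.1
    costs = []
    for _ in range(n_iter):
        # Multiplicative update of W; the square root is the
        # majorization-minimization variant of the update.
        V_hat = W @ H + eps
        W *= np.sqrt(((V / V_hat**2) @ H.T) / ((1.0 / V_hat) @ H.T))
        # Same update for H with the refreshed model.
        V_hat = W @ H + eps
        H *= np.sqrt((W.T @ (V / V_hat**2)) / (W.T @ (1.0 / V_hat)))
        costs.append(is_divergence(V, W @ H + eps))
    return W, H, costs
```

Calling `is_nmf(V, rank=3)` on a positive power matrix returns factors whose product approximates V in the Itakura-Saito sense; how such an update is coupled with the demixing-matrix update is specific to the paper and is not reproduced here.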
Inverse-free Online Independent Vector Analysis with Flexible Iterative Source Steering
In this paper, we propose a new online independent vector analysis (IVA)
algorithm for real-time blind source separation (BSS). In many BSS algorithms,
the iterative projection (IP) has been used for updating the demixing matrix, a
parameter to be estimated in BSS. However, it requires matrix inversion, which
can be costly, particularly in online processing. To improve this situation, we
introduce iterative source steering (ISS) to online IVA. ISS does not require
any matrix inversions, and thus its computational complexity is less than that
of IP. Furthermore, when only some of the sources are moving, ISS enables us to
update the demixing matrix flexibly and effectively so that the steering
vectors of only the moving sources are updated. Numerical experiments under a
dynamic condition confirm the efficacy of the proposed method. Comment: 5 pages, 2 figures. Submitted to APSIPA 202
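To make the "no matrix inversion" point concrete, here is a minimal sketch of one iterative source steering (ISS) step in its batch form, for a single frequency bin. It is an illustration under stated assumptions, not the online algorithm proposed in the paper: the function name `iss_step`, the argument layout, and the use of fixed variance estimates `r` are all choices made for this example.

```python
import numpy as np


def iss_step(W, Y, r, k):
    """One batch ISS update for source k at one frequency bin (sketch).

    W : (M, M) complex demixing matrix
    Y : (M, N) current separated signals, Y = W @ X
    r : (M, N) positive variance estimates of each source over N frames
    k : index of the source whose steering vector is updated
    Returns the updated demixing matrix and separated signals.
    """
    n_frames = Y.shape[1]
    y_k = Y[k].copy()
    # Weighted cross-correlation of every output with output k.
    num = np.sum(Y * np.conj(y_k) / r, axis=1)
    # Weighted power of output k, using each source's own weights.
    den = np.sum(np.abs(y_k) ** 2 / r, axis=1)
    v = num / den
    # The k-th coefficient only rescales output k.
    v[k] = 1.0 - 1.0 / np.sqrt(den[k] / n_frames)
    # Rank-1 update: only divisions and outer products, no matrix inverse.
    W_new = W - np.outer(v, W[k])
    Y_new = Y - np.outer(v, y_k)
    return W_new, Y_new
```

Sweeping `k` over all sources gives one full iteration. Restricting the sweep to a subset of `k` updates only those sources' steering vectors, which is the kind of flexibility the abstract describes for moving sources.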
Robust estimation of directions-of-arrival in diffuse noise based on matrix-space sparsity
We consider the estimation of the Directions-Of-Arrival (DOA) of target signals in diffuse noise. The state-of-the-art MUltiple SIgnal Classification (MUSIC) algorithm necessitates accurate identification of the signal subspace. In diffuse noise, however, it is difficult to identify it directly from the observed spatial covariance matrix. In our approach, we estimate the target spatial covariance matrix, so that we can identify the orthogonal complement of the signal subspace as its null space. We present a unified framework for modeling noise covariance in a matrix space, which generalizes four state-of-the-art diffuse noise models. We propose two alternative algorithms for estimating the target spatial covariance matrix, namely Low-rank Matrix Completion (LMC) and Trace Norm Minimization (TNM). These rely on denoising of the observed spatial covariance matrix via orthogonal projection onto the orthogonal complement of the noise matrix subspace. The missing component lying in the noise matrix subspace is then completed by exploiting the low-rankness of the target spatial covariance matrix. Large-scale experiments with real-world noise show that TNM with a certain noise model outperforms conventional MUSIC based on Generalized EigenValue Decomposition (GEVD) by 5% in terms of the precision averaged over the dataset.
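For readers unfamiliar with MUSIC, the following sketch shows the basic subspace idea the abstract builds on: the noise subspace is taken from the eigenvectors of the spatial covariance matrix, and steering vectors nearly orthogonal to it produce peaks in a pseudo-spectrum. It assumes a uniform linear array, a narrowband far-field model, and spatially white noise, so it is neither the GEVD-based baseline nor the LMC/TNM methods of the paper. The names `steering` and `music_spectrum` are hypothetical.

```python
import numpy as np


def steering(theta_deg, n_mics, spacing=0.5):
    """Far-field steering vector of a uniform linear array (assumed geometry).

    spacing is the microphone distance in wavelengths.
    """
    m = np.arange(n_mics)
    return np.exp(-2j * np.pi * spacing * m * np.sin(np.deg2rad(theta_deg)))


def music_spectrum(R, n_sources, grid_deg, spacing=0.5):
    """MUSIC pseudo-spectrum from a spatial covariance matrix R (sketch).

    R         : (n_mics, n_mics) Hermitian spatial covariance matrix
    n_sources : assumed number of sources (signal subspace dimension)
    grid_deg  : candidate directions in degrees
    """
    n_mics = R.shape[0]
    # eigh returns eigenvalues in ascending order, so the first columns
    # span the noise subspace under the white-noise assumption.
    _, eigvec = np.linalg.eigh(R)
    E_n = eigvec[:, : n_mics - n_sources]
    A = np.stack([steering(th, n_mics, spacing) for th in grid_deg], axis=1)
    # Energy of each steering vector inside the noise subspace.
    denom = np.sum(np.abs(E_n.conj().T @ A) ** 2, axis=0)
    return 1.0 / denom
```

The direction estimate is the grid point that maximizes the returned spectrum. In diffuse noise the white-noise assumption behind this subspace split no longer holds, which is the difficulty the abstract addresses.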
Signal Reconstruction from Mel-spectrogram Based on Bi-level Consistency of Full-band Magnitude and Phase
We propose an optimization-based method for reconstructing a time-domain
signal from a low-dimensional spectral representation such as a
mel-spectrogram. Phase reconstruction has been studied to reconstruct a
time-domain signal from the full-band short-time Fourier transform (STFT)
magnitude. The Griffin-Lim algorithm (GLA) has been widely used because it
relies only on the redundancy of STFT and is applicable to various audio
signals. In this paper, we jointly reconstruct the full-band magnitude and
phase by considering the bi-level relationships among the time-domain signal,
its STFT coefficients, and its mel-spectrogram. The proposed method is
formulated as a rigorous optimization problem and estimates the full-band
magnitude based on the criterion used in GLA. Our experiments demonstrate the
effectiveness of the proposed method on speech, music, and environmental
signals. Comment: Accepted to IEEE WASPAA 202
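As background for the abstract above, the plain Griffin-Lim algorithm (GLA) can be sketched as alternating between imposing the given STFT magnitude and projecting onto the set of consistent spectrograms. The code below is a minimal NumPy illustration of that baseline only; it does not implement the proposed mel-spectrogram method. The helper names, the Hann window, and the hop size used in the usage note are assumptions for this example.

```python
import numpy as np


def stft(x, win, hop):
    """Simple STFT without padding: frames of len(win) samples, hop apart."""
    n_fft = len(win)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop : i * hop + n_fft] * win for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)


def istft(S, win, hop):
    """Least-squares inverse STFT by windowed overlap-add."""
    n_frames, n_fft = S.shape[0], len(win)
    frames = np.fft.irfft(S, n=n_fft, axis=1) * win
    x = np.zeros(n_fft + hop * (n_frames - 1))
    norm = np.zeros_like(x)
    for i in range(n_frames):
        x[i * hop : i * hop + n_fft] += frames[i]
        norm[i * hop : i * hop + n_fft] += win**2
    # Guard against division by zero where the window vanishes.
    return x / np.maximum(norm, 1e-8)


def griffin_lim(mag, win, hop, n_iter=50, seed=0):
    """Plain GLA: recover a signal whose STFT magnitude approximates mag."""
    rng = np.random.default_rng(seed)
    # Start from a random phase.
    phase = np.exp(2j * np.pi * rng.random(mag.shape))
    for _ in range(n_iter):
        # Project onto consistent spectrograms, then keep only the phase.
        S = stft(istft(mag * phase, win, hop), win, hop)
        phase = np.exp(1j * np.angle(S))
    return istft(mag * phase, win, hop)
```

With, for example, `win = np.hanning(256)` and `hop = 64`, passing `np.abs(stft(x, win, hop))` to `griffin_lim` returns a time-domain signal of the same length as the overlap-added frames. The method in the abstract additionally estimates the full-band magnitude from the mel-spectrogram, which this sketch takes as given.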