997 research outputs found
Blind Source Separation with Optimal Transport Non-negative Matrix Factorization
Optimal transport as a loss for machine learning optimization problems has
recently gained a lot of attention. Building upon recent advances in
computational optimal transport, we develop an optimal transport non-negative
matrix factorization (NMF) algorithm for supervised speech blind source
separation (BSS). Optimal transport allows us to design and leverage a cost
between short-time Fourier transform (STFT) spectrogram frequencies, which
takes into account how humans perceive sound. We give empirical evidence that
using our proposed optimal transport NMF leads to perceptually better results
than Euclidean NMF, for both isolated voice reconstruction and BSS tasks.
Finally, we demonstrate how to use optimal transport for cross domain sound
processing tasks, where frequencies represented in the input spectrograms may
be different from one spectrogram to another.Comment: 22 pages, 7 figures, 2 additional file
Learning with a Wasserstein loss
Learning to predict multi-label outputs is challenging, but in many problems there is a natural metric on the outputs that can be used to improve predictions.In this paper we develop a loss function for multi-label learning, based on the Wasserstein distance. The Wasserstein distance provides a natural notion of dissimilarity for probability measures. Although optimizing with respect to the exact Wasserstein distance is costly, recent work has described a regularized approximation that is efficiently computed. We describe an efficient learning algorithm based on this regularization, as well as a novel extension of the Wasserstein distance from probability measures to unnormalized measures. We also describe a statistical learning bound for the loss. The Wasserstein loss can encourage smoothness of the predictions with respect to a chosen metric on the output space. We demonstrate this property on a real-data tag prediction problem, using the Yahoo Flickr Creative Commons dataset, outperforming a baseline that doesn't use the metric
- …