Blind Source Separation with Optimal Transport Non-negative Matrix Factorization

Authors: Blondel Mathieu; Rolet Antoine; Sawada Hiroshi; Seguy Vivien
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 15/02/2018

Optimal transport as a loss for machine learning optimization problems has recently gained a lot of attention. Building upon recent advances in computational optimal transport, we develop an optimal transport non-negative matrix factorization (NMF) algorithm for supervised speech blind source separation (BSS). Optimal transport allows us to design and leverage a cost between short-time Fourier transform (STFT) spectrogram frequencies, which takes into account how humans perceive sound. We give empirical evidence that using our proposed optimal transport NMF leads to perceptually better results than Euclidean NMF, for both isolated voice reconstruction and BSS tasks. Finally, we demonstrate how to use optimal transport for cross-domain sound processing tasks, where frequencies represented in the input spectrograms may be different from one spectrogram to another.

Comment: 22 pages, 7 figures, 2 additional files
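
The idea of a cost between STFT frequencies can be illustrated with a small sketch. The abstract does not say which perceptual scale the authors use, so the mel-scale conversion below is an assumption chosen for illustration, and the helper names (`hz_to_mel`, `stft_frequencies`, `perceptual_cost`) are hypothetical. The sketch also shows why such a cost suits the cross-domain case: the two spectrograms may have different frequency grids, and the cost matrix is simply rectangular.

```python
import math

def hz_to_mel(f_hz):
    # Widely used mel-scale formula; an assumed stand-in for
    # "how humans perceive sound", not the paper's stated cost.
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def stft_frequencies(n_fft, sample_rate):
    # Center frequencies (Hz) of the n_fft // 2 + 1 non-negative STFT bins.
    return [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]

def perceptual_cost(freqs_a, freqs_b):
    # cost[i][j]: squared mel-scale distance between bin i of the first
    # grid and bin j of the second grid. The grids may differ in size.
    return [[(hz_to_mel(fa) - hz_to_mel(fb)) ** 2 for fb in freqs_b]
            for fa in freqs_a]

# Two spectrograms recorded with different settings (hypothetical values).
grid_a = stft_frequencies(n_fft=512, sample_rate=16000)
grid_b = stft_frequencies(n_fft=1024, sample_rate=44100)
cost = perceptual_cost(grid_a, grid_b)
print(len(cost), "x", len(cost[0]), "cost matrix")
```

Under this assumed cost, moving energy by 100 Hz is more expensive at low frequencies than at high ones, which is one simple way a transport loss can reflect perception where a Euclidean loss treats all bins alike.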

arXiv.org e-Print Archive

Directory of Open Access Journals

Learning with a Wasserstein loss

Authors: Araya-Polo Mauricio; Frogner Charles Albert; Mobahi Hossein; Poggio Tomaso A; Zhang Chiyuan
Publication venue: 'MIT Press - Journals'
Publication date: 17/11/2017

Learning to predict multi-label outputs is challenging, but in many problems there is a natural metric on the outputs that can be used to improve predictions. In this paper we develop a loss function for multi-label learning, based on the Wasserstein distance. The Wasserstein distance provides a natural notion of dissimilarity for probability measures. Although optimizing with respect to the exact Wasserstein distance is costly, recent work has described a regularized approximation that is efficiently computed. We describe an efficient learning algorithm based on this regularization, as well as a novel extension of the Wasserstein distance from probability measures to unnormalized measures. We also describe a statistical learning bound for the loss. The Wasserstein loss can encourage smoothness of the predictions with respect to a chosen metric on the output space. We demonstrate this property on a real-data tag prediction problem, using the Yahoo Flickr Creative Commons dataset, outperforming a baseline that doesn't use the metric.
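
The regularized approximation mentioned in the abstract is commonly computed with Sinkhorn scaling iterations. The following is a minimal sketch of that general technique for normalized histograms, not the authors' learning algorithm or their extension to unnormalized measures. The function name `sinkhorn_distance`, the default `reg`, and the iteration count are illustrative assumptions.

```python
import math

def sinkhorn_distance(a, b, cost, reg=0.1, n_iters=200):
    """Entropy-regularized transport cost between histograms a and b.

    a, b : lists of non-negative weights, each summing to 1
    cost : cost[i][j] is the ground metric between bin i and bin j
    reg  : regularization strength (assumed default, tune per problem)
    """
    n, m = len(a), len(b)
    # Gibbs kernel K = exp(-cost / reg).
    K = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale so the plan's row sums match a
        # and its column sums match b.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Transport plan P = diag(u) K diag(v); return the cost it incurs.
    return sum(u[i] * K[i][j] * v[j] * cost[i][j]
               for i in range(n) for j in range(m))

# Ground metric on three ordered labels: distance is the index gap.
cost = [[abs(i - j) for j in range(3)] for i in range(3)]
target = [1.0, 0.0, 0.0]
near_miss = [0.0, 1.0, 0.0]
far_miss = [0.0, 0.0, 1.0]
print(sinkhorn_distance(near_miss, target, cost)
      < sinkhorn_distance(far_miss, target, cost))
```

The comparison at the end shows the property the abstract relies on: a prediction that misses by one step in the output metric is penalized less than one that misses by two, whereas a loss that ignores the metric would score both errors the same.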

DSpace@MIT