A parametric method for pitch estimation of piano tones
The efficiency of most pitch estimation methods declines when the analyzed frame is shortened and/or when a wide fundamental frequency (F0) range is targeted. The technique proposed herein jointly uses a periodicity analysis and a spectral matching process to improve F0 estimation performance in such an adverse context: a 60 ms data frame covering the whole 7 1/4-octave piano tessitura. The enhancements are obtained thanks to a parametric approach which, among other things, models the inharmonicity of piano tones. The performance of the algorithm is assessed, compared to the results obtained from other estimators, and discussed in order to characterize their behavior and typical misestimations. Index Terms: audio processing, pitch estimation.
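The inharmonicity mentioned in the abstract is usually captured by the classical stiff-string model, in which partial k of a tone with fundamental F0 lies at k * F0 * sqrt(1 + b*k^2) rather than at the exact harmonic k * F0. A minimal sketch of that model (the coefficient b and values below are illustrative assumptions, not the paper's exact parametric model):

```python
import numpy as np

def partial_frequencies(f0, n_partials, b):
    """Frequencies of the first partials of a stiff-string tone.

    Stiff-string model: f_k = k * f0 * sqrt(1 + b * k^2), where b is the
    inharmonicity coefficient (b = 0 gives a perfectly harmonic tone).
    """
    k = np.arange(1, n_partials + 1)
    return k * f0 * np.sqrt(1.0 + b * k ** 2)

# A harmonic tone has exact integer multiples of f0; a piano string's
# stiffness stretches the upper partials progressively upward, which is
# why a purely harmonic spectral template mismatches real piano spectra.
harmonic = partial_frequencies(440.0, 5, 0.0)
inharmonic = partial_frequencies(440.0, 5, 1e-4)  # illustrative b value
```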
Recovery and convergence rate of the Frank-Wolfe Algorithm for the m-EXACT-SPARSE Problem
We study the properties of the Frank-Wolfe algorithm for solving the m-EXACT-SPARSE reconstruction problem, where a signal y must be expressed as a sparse linear combination of a predefined set of atoms, called a dictionary. We prove that when the signal is sparse enough with respect to the coherence of the dictionary, the iterative process implemented by the Frank-Wolfe algorithm only recruits atoms from the support of the signal, that is, the smallest set of atoms from the dictionary that allows for a perfect reconstruction of y. We also prove that, under this same condition, there exists an iteration beyond which the algorithm converges exponentially.
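The atom-recruitment behavior the recovery result is about can be sketched with a plain Frank-Wolfe iteration on the l1 ball: the linear-minimization oracle selects the single atom most correlated with the residual. This is a generic sketch with the classical 2/(t+2) step size, not the paper's exact setting:

```python
import numpy as np

def frank_wolfe_l1(y, D, beta, n_iter=50):
    """Frank-Wolfe for min ||y - D x||^2  s.t.  ||x||_1 <= beta.

    Each iteration, the linear-minimization oracle over the l1 ball of
    radius beta picks the atom most correlated with the residual; the
    iterate then moves toward the corresponding vertex.
    """
    x = np.zeros(D.shape[1])
    for t in range(n_iter):
        grad = -2.0 * D.T @ (y - D @ x)     # gradient of the squared loss
        j = np.argmax(np.abs(grad))         # most correlated atom
        if np.abs(grad[j]) < 1e-12:         # already optimal
            break
        s = np.zeros_like(x)
        s[j] = -beta * np.sign(grad[j])     # minimizing vertex of the ball
        gamma = 2.0 / (t + 2.0)             # classical step size
        x = (1.0 - gamma) * x + gamma * s   # convex-combination update
    return x
```

On a trivially sparse instance (identity dictionary), the iterate recruits only the support atom.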
Convex nonnegative matrix factorization with missing data
Convex nonnegative matrix factorization (CNMF) is a variant of nonnegative matrix factorization (NMF) in which the components are convex combinations of atoms of a known dictionary. In this contribution, we propose to extend CNMF to the case where the data matrix and the dictionary have missing entries. After formulating the problem in this missing-data context, we propose a majorization-minimization algorithm to solve the resulting optimization problem. Experimental results with synthetic data and audio spectrograms show an improvement in reconstruction performance with respect to standard NMF. The performance gap is particularly significant when the reconstruction task becomes arduous, e.g. when the ratio of missing data is high, the noise level is high, or the data are complex.
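The missing-data ingredient can be illustrated with masked multiplicative updates for plain NMF (a simplified stand-in: this is the standard majorization-minimization scheme for a Euclidean loss restricted to observed entries, not the paper's convex variant with a missing-entry dictionary):

```python
import numpy as np

def masked_nmf(V, M, rank, n_iter=300, seed=0):
    """NMF with missing entries: minimize ||M * (V - W H)||_F^2.

    M is a binary mask (1 = observed, 0 = missing). The multiplicative
    updates below are the classical weighted-NMF MM updates, which only
    fit the observed entries of V.
    """
    rng = np.random.default_rng(seed)
    F, N = V.shape
    W = rng.random((F, rank)) + 1e-3
    H = rng.random((rank, N)) + 1e-3
    MV = M * V
    for _ in range(n_iter):
        H *= (W.T @ MV) / (W.T @ (M * (W @ H)) + 1e-12)
        W *= (MV @ H.T) / ((M * (W @ H)) @ H.T + 1e-12)
    return W, H
```

On low-rank data with a few entries masked out, the factorization still fits the observed entries closely.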
An investigation of discrete-state discriminant approaches to single-sensor source separation
This paper investigates a new scheme for single-sensor audio source separation. The framework is introduced in comparison with the existing Gaussian mixture model generative approach and focuses on the mixture states rather than on the source states, resulting in a discrete, joint-state discriminant approach. The study establishes the theoretical performance bounds of the proposed scheme, and an actual source separation system is designed. The performance is computed on a set of musical recordings and a discussion is proposed, including the question of source correlation and the possible drawbacks of the method.
QuicK-means: Acceleration of K-means by learning a fast transform
K-means -- and the celebrated Lloyd algorithm -- is more than the clustering method it was originally designed to be. It has indeed proven pivotal in speeding up many machine learning and data analysis techniques such as indexing, nearest-neighbor search and prediction, data compression, and Radial Basis Function networks; its beneficial use has been shown to carry over to the acceleration of kernel machines (when using the Nyström method). Here, we propose a fast extension of K-means, dubbed QuicK-means, that rests on the idea of expressing the matrix of the K centroids as a product of sparse matrices, a feat made possible by recent results devoted to finding approximations of matrices as products of sparse factors. Using such a decomposition reduces the complexity of the matrix-vector product between the factorized centroid matrix and any vector from O(A B) to O(A log A + B), with A = min(K, D) and B = max(K, D), where D is the dimension of the training data. This drastic computational saving has a direct impact on the assignment of a point to a cluster, meaning that it is tangible not only at prediction time but also at training time, provided the factorization procedure is performed during Lloyd's algorithm. We show precisely that resorting to a factorization step at each iteration does not impair the convergence of the optimization scheme and that, depending on the context, it may entail a reduction of the training time. Finally, we provide discussions and numerical simulations that show the versatility of our computationally efficient QuicK-means algorithm.
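The mechanism behind the saving is simply that applying a chain of sparse factors costs roughly the sum of their nonzeros, instead of the full dense K*D product. A small sketch, with random sparse placeholders standing in for the factors an actual factorization procedure would learn:

```python
import numpy as np
from scipy import sparse

# If the K x D centroid matrix U factorizes as S_1 @ S_2 @ S_3 with
# sparse S_q, then U @ x can be evaluated as three cheap sparse
# matrix-vector products. The factors here are random placeholders,
# not the output of a real sparse-factorization algorithm.
rng = np.random.default_rng(0)
K = D = 64
factors = [sparse.random(K, D, density=0.05, random_state=i, format="csr")
           for i in range(3)]
U = factors[0] @ factors[1] @ factors[2]   # the (implicit) full matrix

def fast_matvec(factors, x):
    """Apply the factorized matrix right-to-left: cost ~ total nnz."""
    for S in reversed(factors):
        x = S @ x
    return x

x = rng.standard_normal(D)
assert np.allclose(U @ x, fast_matvec(factors, x))
```

In the cluster-assignment step of Lloyd's algorithm, every distance computation involves such a centroid-matrix product, which is where the saving shows up at both training and prediction time.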
Optimal spectral transportation with application to music transcription
Many spectral unmixing methods rely on the non-negative decomposition of spectral data onto a dictionary of spectral templates. In particular, state-of-the-art music transcription systems decompose the spectrogram of the input signal onto a dictionary of representative note spectra. The typical measures of fit used to quantify the adequacy of the decomposition compare the data and template entries frequency-wise. As such, small displacements of energy from one frequency bin to another, as well as variations of timbre, can disproportionately harm the fit. We address these issues by means of optimal transportation and propose a new measure of fit that treats the frequency distributions of energy holistically, as opposed to frequency-wise. Building on the harmonic nature of sound, the new measure is invariant to shifts of energy to harmonically related frequencies, as well as to small and local displacements of energy. Equipped with this new measure of fit, the dictionary of note templates can be considerably simplified to a set of Dirac vectors located at the target fundamental frequencies (musical pitch values). This in turn gives ground to a very fast and simple decomposition algorithm that achieves state-of-the-art performance on real musical data. Many of today's spectral unmixing techniques rely on non-negative matrix decompositions. This concerns for example hyperspectral remote sensing (with applications in Earth observation, astronomy, chemistry, etc.) and audio signal processing. The spectral sample v_n (the spectrum of light observed at a given pixel n, or the audio spectrum in a given time frame n) is decomposed onto a dictionary W of elementary spectral templates, characteristic of pure materials or sound objects, such that v_n ≈ W h_n. The composition of sample n can be inferred from the non-negative expansion coefficients h_n.
This paradigm has led to state-of-the-art results for various tasks (recognition, classification, denoising, separation) in the aforementioned areas, and in particular in music transcription, the central application of this paper. In state-of-the-art music transcription systems, the spectrogram V (with columns v_n) of a musical signal is decomposed onto a dictionary of pure notes (in so-called multi-pitch estimation) or chords. V typically consists of (power-)magnitude values of a regular short-time Fourier transform (Smaragdis and Brown, 2003). It may also consist of an audio-specific spectral transform such as the Mel-frequency transform, as in (Vincent et al., 2010), or the constant-Q transform, as in (Oudre et al., 2011). The success of the transcription system depends of course on the adequacy of the time-frequency transform and the dictionary to represent the data V.
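The core contrast between bin-wise and transport-based fits is easy to see on a toy example. Below, plain 1-D Wasserstein distance (not the paper's harmonic-invariant transport cost) is compared to a Euclidean fit on two spectra whose energy sits in adjacent frequency bins:

```python
import numpy as np
from scipy.stats import wasserstein_distance

# Two unit-energy "spectra" differing only by a one-bin shift.
bins = np.arange(10).astype(float)
a = np.zeros(10); a[4] = 1.0          # all energy at bin 4
b = np.zeros(10); b[5] = 1.0          # same energy, shifted to bin 5

binwise = np.sum((a - b) ** 2)        # Euclidean fit compares bin by bin
transport = wasserstein_distance(bins, bins, a, b)

# The bin-wise cost treats the shifted spectrum as maximally different
# (cost 2.0), while the transport cost only pays for moving the mass
# one bin (cost 1.0) -- small displacements of energy stay cheap.
```

This robustness to local displacements is what lets the note dictionary collapse to Dirac vectors at the target fundamental frequencies.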
Dynamic Screening: Accelerating First-Order Algorithms for the Lasso and Group-Lasso
Recent computational strategies based on screening tests have been proposed to accelerate algorithms addressing penalized sparse regression problems such as the Lasso. Such approaches build upon the idea that it is worth dedicating some small computational effort to locate inactive atoms and remove them from the dictionary in a preprocessing stage, so that the regression algorithm working with a smaller dictionary will then converge faster to the solution of the initial problem. We believe that there is an even more efficient way to screen the dictionary and obtain a greater acceleration: inside each iteration of the regression algorithm, one may take advantage of the algorithm's computations to obtain a new screening test for free, with increasing screening effects along the iterations. The dictionary is thus dynamically screened instead of being screened statically, once and for all, before the first iteration. We formalize this dynamic screening principle in a general algorithmic scheme and apply it by embedding adapted existing screening tests inside a number of first-order algorithms to solve the Lasso, as well as new screening tests to solve the Group-Lasso. Computational gains are assessed in a large set of experiments on synthetic data as well as real-world sounds and images. They show both the screening efficiency and the gain in terms of running times.
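The principle of screening inside the iterations can be sketched with ISTA and a gap-based safe sphere test (the well-known gap safe sphere is used here as a stand-in for the paper's specific tests; columns of D are assumed unit-norm):

```python
import numpy as np

def ista_dynamic_screening(y, D, lam, n_iter=100):
    """ISTA for the Lasso, 0.5||y - D x||^2 + lam ||x||_1, with a
    dynamic screening test run at every iteration.

    At each iteration, a dual-feasible point is obtained for free from
    the residual; the duality gap gives a safe sphere around it, and
    atoms whose correlation with that sphere provably stays below lam
    are discarded, shrinking the dictionary as the iterations proceed.
    """
    n_atoms = D.shape[1]
    active = np.ones(n_atoms, dtype=bool)
    x = np.zeros(n_atoms)
    L = np.linalg.norm(D, 2) ** 2                  # Lipschitz constant
    for _ in range(n_iter):
        r = y - D[:, active] @ x[active]           # residual
        # dual-feasible point: rescale r/lam into the dual ball
        scale = min(1.0, lam / max(np.abs(D.T @ r).max(), 1e-12))
        theta = scale * r / lam
        # duality gap -> radius of the safe sphere around theta
        primal = 0.5 * (r @ r) + lam * np.abs(x).sum()
        dual = 0.5 * (y @ y) - 0.5 * lam**2 * np.sum((theta - y / lam) ** 2)
        radius = np.sqrt(max(2.0 * (primal - dual), 0.0)) / lam
        # screening: atoms that cannot be active at the optimum
        keep = np.abs(D.T @ theta) + radius >= 1.0
        x[~keep] = 0.0
        active &= keep
        # ISTA step on the shrunken dictionary
        r = y - D[:, active] @ x[active]
        g = x[active] + (D[:, active].T @ r) / L
        x[active] = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)
    return x, active
```

As the iterate approaches the solution the gap shrinks, the sphere tightens, and more atoms are screened, so each subsequent iteration is cheaper.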
A Dynamic Screening Principle for the Lasso
The Lasso is an optimization problem devoted to finding a sparse representation of some signal with respect to a predefined dictionary. An original and computationally efficient method is proposed here to solve this problem, based on a dynamic screening principle. It makes it possible to accelerate a large class of optimization algorithms by iteratively reducing the size of the dictionary during the optimization process, discarding elements that are provably known not to belong to the solution of the Lasso. This iterative reduction of the dictionary is what we call dynamic screening. As this screening step is inexpensive, the computational cost of an algorithm using our dynamic screening strategy is lower than that of the base algorithm. Numerical experiments on synthetic and real data support the relevance of this approach.