162 research outputs found

Wind fields at the sea surface determined from combined ship and satellite altimeter data

Authors: Gohin F.; Muzellec A.; Servain Jacques
Publication date: 01/01/1993

A Note on Optimizing Distributions using Kernel Mean Embeddings

Authors: Bach Francis; Muzellec Boris; Rudi Alessandro
Publication date: 18/06/2021

Kernel mean embeddings are a popular tool that represents probability measures by their infinite-dimensional mean embeddings in a reproducing kernel Hilbert space. When the kernel is characteristic, mean embeddings can be used to define a distance between probability measures, known as the maximum mean discrepancy (MMD). A well-known advantage of mean embeddings and MMD is their low computational cost and low sample complexity. However, kernel mean embeddings have had limited applications to problems that involve optimizing distributions, due to the difficulty of characterizing which Hilbert space vectors correspond to a probability distribution. In this note, we propose to leverage the kernel sums-of-squares parameterization of positive functions of Marteau-Ferey et al. [2020] to fit distributions in the MMD geometry. First, we show that when the kernel is characteristic, distributions with a kernel sum-of-squares density are dense. Then, we provide algorithms to optimize such distributions in the finite-sample setting, which we illustrate in a density fitting numerical experiment.

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server
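
The abstract above defines the maximum mean discrepancy (MMD) as a distance between mean embeddings. As an illustration only, the following is a minimal sketch of the standard biased sample estimate of the squared MMD. The Gaussian kernel, the `bandwidth` parameter, and the function names are assumptions chosen for the example; they are not taken from the paper, and the sketch does not implement the paper's kernel sum-of-squares method.

```python
import numpy as np

def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between the rows of x and the rows of y."""
    sq_dists = (
        np.sum(x**2, axis=1)[:, None]
        + np.sum(y**2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    # Clip tiny negative values caused by floating-point cancellation.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))

def mmd_squared(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    k_xx = gaussian_kernel(x, x, bandwidth)
    k_yy = gaussian_kernel(y, y, bandwidth)
    k_xy = gaussian_kernel(x, y, bandwidth)
    return k_xx.mean() + k_yy.mean() - 2.0 * k_xy.mean()

# Usage: two samples drawn from shifted distributions (hypothetical data).
rng = np.random.default_rng(0)
sample_p = rng.normal(size=(200, 2))
sample_q = rng.normal(size=(200, 2)) + 3.0
print(mmd_squared(sample_p, sample_q))
```

Each kernel matrix is averaged directly, so the cost is quadratic in the sample size, which is one way to read the "low computational cost" mentioned in the abstract.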

Contribution of GEOSAT altimeter winds to the determination of wind fields in the tropical Atlantic

Authors: Gohin Francis (collab.); Muzellec A.; Servain Jacques (dir.)
Publication venue: ORSTOM
Publication date: 01/01/1991

Horizon / Pleins textes

Missing Data Imputation using Optimal Transport

Authors: Boyer Claire; Cuturi Marco; Josse Julie; Muzellec Boris
Publication date: 01/01/2020

Missing data is a crucial issue when applying machine learning algorithms to real-world datasets. Starting from the simple assumption that two batches extracted randomly from the same dataset should share the same distribution, we leverage optimal transport distances to quantify that criterion and turn it into a loss function to impute missing data values. We propose practical methods to minimize these losses using end-to-end learning, which may or may not exploit parametric assumptions on the underlying distributions of values. We evaluate our methods on datasets from the UCI repository, in MCAR, MAR and MNAR settings. These experiments show that OT-based methods match or outperform state-of-the-art imputation methods, even for high percentages of missing values.

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

Digital Business Models

Authors: Muzellec Laurent; Ronteau Sébastien; Saxena Deepak; Trabucchi Daniel
Publication venue: Walter de Gruyter GmbH

This book provides an overview of how digital players create, exchange and capture value thanks to digital technologies. It describes the key characteristics of various digital business models using different business archetypes. Each chapter is illustrated with examples or mini-case studies and also comprises a toolbox describing strategic tools, canvases and frameworks that help managers analyse a situation and formulate proactive solutions.

OAPEN Library

Gradient strikes back: How filtering out high frequencies improves explanations

Authors: Andéol Léo; Fel Thomas; Muzellec Sabine; Serre Thomas; VanRullen Rufin
Publication date: 18/07/2023

Recent years have witnessed an explosion in the development of novel prediction-based attribution methods, which have slowly been supplanting older gradient-based methods to explain the decisions of deep neural networks. However, it is still not clear why prediction-based methods outperform gradient-based ones. Here, we start with an empirical observation: these two approaches yield attribution maps with very different power spectra, with gradient-based methods revealing more high-frequency content than prediction-based methods. This observation raises multiple questions: What is the source of this high-frequency information, and does it truly reflect decisions made by the system? Lastly, why would the absence of high-frequency information in prediction-based methods yield better explainability scores along multiple metrics? We analyze the gradient of three representative visual classification models and observe that it contains noisy information emanating from high frequencies. Furthermore, our analysis reveals that the operations used in Convolutional Neural Networks (CNNs) for downsampling appear to be a significant source of this high-frequency content, suggesting aliasing as a possible underlying basis. We then apply an optimal low-pass filter for attribution maps and demonstrate that it improves gradient-based attribution methods. We show that (i) removing high-frequency noise yields significant improvements in the explainability scores obtained with gradient-based methods across multiple models, leading to (ii) a novel ranking of state-of-the-art methods with gradient-based methods at the top. We believe that our results will spur renewed interest in simpler and computationally more efficient gradient-based methods for explainability.

arXiv.org e-Print Archive
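
The abstract above describes applying a low-pass filter to attribution maps. As a rough illustration of that idea, the sketch below removes high spatial frequencies from a 2D map with a hard circular cutoff in the Fourier domain. The hard cutoff, the `cutoff` parameter, and the function name are assumptions made for this example. The abstract does not specify the shape of the filter or how the optimal cutoff is selected, so this should not be read as the authors' procedure.

```python
import numpy as np

def low_pass_filter(attribution, cutoff):
    """Zero out spatial frequencies whose radius (in cycles per image) exceeds cutoff.

    `attribution` is a 2D array; `cutoff` is a hypothetical tuning parameter.
    """
    height, width = attribution.shape
    # Integer frequency indices along each axis, e.g. [0, 1, 2, ..., -2, -1].
    freq_y = np.fft.fftfreq(height) * height
    freq_x = np.fft.fftfreq(width) * width
    radius = np.sqrt(freq_y[:, None] ** 2 + freq_x[None, :] ** 2)
    mask = radius <= cutoff
    spectrum = np.fft.fft2(attribution)
    # The mask is symmetric in frequency, so the inverse transform is real
    # up to rounding error; np.real drops the negligible imaginary part.
    return np.real(np.fft.ifft2(spectrum * mask))

# Usage: a smooth map plus a checkerboard standing in for high-frequency noise.
rows, cols = np.indices((8, 8))
smooth = np.ones((8, 8))
noisy = smooth + 0.5 * (-1.0) ** (rows + cols)
filtered = low_pass_filter(noisy, cutoff=2)
```

In this toy input the checkerboard sits at the highest representable frequency, so it falls outside the cutoff while the constant component passes through unchanged.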

Entropic Optimal Transport between Unbalanced Gaussian Measures has a Closed Form

Authors: Cuturi Marco; Janati Hicham; Muzellec Boris; Peyré Gabriel
Publication venue: HAL CCSD
Publication date: 06/12/2020

Although optimal transport (OT) problems admit closed form solutions in very few notable cases, e.g. in 1D or between Gaussians, these closed forms have proved extremely fecund for practitioners to define tools inspired from the OT geometry. On the other hand, the numerical resolution of OT problems using entropic regularization has given rise to many applications, but because there are no known closed-form solutions for entropic regularized OT problems, these approaches are mostly algorithmic, not informed by elegant closed forms. In this paper, we propose to fill the void at the intersection between these two schools of thought in OT by proving that the entropy-regularized optimal transport problem between two Gaussian measures admits a closed form. Contrary to the unregularized case, for which the explicit form is given by the Wasserstein-Bures distance, the closed form we obtain is differentiable everywhere, even for Gaussians with degenerate covariance matrices. We obtain this closed form solution by solving the fixed-point equation behind Sinkhorn's algorithm, the default method for computing entropic regularized OT. Remarkably, this approach extends to the generalized unbalanced case, where Gaussian measures are scaled by positive constants. This extension leads to a closed form expression for unbalanced Gaussians as well, and highlights the mass transportation / destruction trade-off seen in unbalanced optimal transport. Moreover, in both settings, we show that the optimal transportation plans are (scaled) Gaussians and provide analytical formulas of their parameters. These formulas constitute the first non-trivial closed forms for entropy-regularized optimal transport, thus providing a ground truth for the analysis of entropic OT and Sinkhorn's algorithm.

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

HAL-CEA
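
The abstract above refers to the fixed-point equation behind Sinkhorn's algorithm. For context, here is a minimal sketch of the standard Sinkhorn iterations for entropy-regularized OT between two discrete measures. It illustrates the algorithmic approach the abstract contrasts with closed forms and does not reproduce the paper's Gaussian closed form. The squared-distance cost, the value of `epsilon`, the iteration count, and all names are assumptions for the example.

```python
import numpy as np

def sinkhorn(a, b, cost, epsilon=0.5, n_iters=500):
    """Entropy-regularized OT plan between histograms a and b.

    Alternately rescales the Gibbs kernel so that the plan's row and
    column sums match a and b (the Sinkhorn fixed-point iteration).
    """
    gibbs = np.exp(-cost / epsilon)
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iters):
        u = a / (gibbs @ v)      # enforce row marginals
        v = b / (gibbs.T @ u)    # enforce column marginals
    return u[:, None] * gibbs * v[None, :]

# Usage: uniform weights on two small 1D point clouds (hypothetical data).
x = np.linspace(0.0, 1.0, 5)
y = np.linspace(0.0, 1.0, 4)
cost = (x[:, None] - y[None, :]) ** 2
a = np.full(5, 1.0 / 5)
b = np.full(4, 1.0 / 4)
plan = sinkhorn(a, b, cost)
transport_cost = np.sum(plan * cost)
```

This plain version can underflow when `epsilon` is small relative to the cost scale; log-domain updates are the usual remedy and are omitted here for brevity.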