162 research outputs found
A Note on Optimizing Distributions using Kernel Mean Embeddings
Kernel mean embeddings are a popular tool for representing
probability measures by their infinite-dimensional mean embeddings in a
reproducing kernel Hilbert space. When the kernel is characteristic, mean
embeddings can be used to define a distance between probability measures, known
as the maximum mean discrepancy (MMD). A well-known advantage of mean
embeddings and MMD is their low computational cost and low sample complexity.
However, kernel mean embeddings have seen limited application to problems that
involve optimizing distributions, due to the difficulty of characterizing
which Hilbert space vectors correspond to a probability distribution. In this
note, we propose to leverage the kernel sums-of-squares parameterization of
positive functions of Marteau-Ferey et al. [2020] to fit distributions in the
MMD geometry. First, we show that when the kernel is characteristic,
distributions with a kernel sum-of-squares density are dense. Then, we provide
algorithms to optimize such distributions in the finite-sample setting, which
we illustrate in a density-fitting numerical experiment.
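The MMD mentioned in this abstract can be estimated from two samples as a combination of kernel averages. The following is a minimal sketch, not the authors' algorithm: it assumes a Gaussian kernel and uses the biased (V-statistic) estimator. The function names and the `bandwidth` parameter are illustrative choices.

```python
import numpy as np

def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel matrix between the rows of x (n, d) and y (m, d)."""
    sq_dists = (
        np.sum(x**2, axis=1)[:, None]
        + np.sum(y**2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    # Clip tiny negative values caused by floating-point cancellation.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))

def mmd_squared(x, y, bandwidth=1.0):
    """Biased estimate of the squared MMD between samples x and y."""
    k_xx = gaussian_kernel(x, x, bandwidth).mean()
    k_yy = gaussian_kernel(y, y, bandwidth).mean()
    k_xy = gaussian_kernel(x, y, bandwidth).mean()
    return k_xx + k_yy - 2.0 * k_xy
```

For instance, `mmd_squared(x, x)` is zero up to rounding, and the estimate grows as the two samples are drawn from increasingly different distributions.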
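Apport des vents altimétriques GEOSAT à la détermination des champs de vents dans l'Atlantique tropical [Contribution of GEOSAT altimeter winds to the determination of wind fields in the tropical Atlantic]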
Missing Data Imputation using Optimal Transport
Missing data is a crucial issue when applying machine learning algorithms to
real-world datasets. Starting from the simple assumption that two batches
extracted randomly from the same dataset should share the same distribution, we
leverage optimal transport distances to quantify that criterion and turn it
into a loss function to impute missing data values. We propose practical
methods to minimize these losses using end-to-end learning, which may or may
not exploit parametric assumptions on the underlying distributions of values. We
evaluate our methods on datasets from the UCI repository, in MCAR, MAR and MNAR
settings. These experiments show that OT-based methods match or outperform
state-of-the-art imputation methods, even for high percentages of missing
values.
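The criterion described in this abstract, that two random batches from the same dataset should be close in an optimal transport sense, can be illustrated with an entropy-regularized OT cost between two batches. This is a sketch under stated assumptions rather than the paper's loss: it uses plain Sinkhorn iterations with uniform weights and a squared Euclidean cost, and the names `sinkhorn_cost`, `epsilon`, and `n_iters` are hypothetical. It does not perform the imputation itself.

```python
import numpy as np

def sinkhorn_cost(x, y, epsilon=0.5, n_iters=200):
    """Transport cost of the entropic OT plan between batches x (n, d) and y (m, d).

    Assumes uniform weights and moderate costs relative to epsilon;
    very large cost / epsilon ratios would underflow in this simple form.
    """
    n, m = x.shape[0], y.shape[0]
    a = np.full(n, 1.0 / n)
    b = np.full(m, 1.0 / m)
    cost = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=2)
    kernel = np.exp(-cost / epsilon)
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = a / (kernel @ v)
        v = b / (kernel.T @ u)
    plan = u[:, None] * kernel * v[None, :]
    return float(np.sum(plan * cost)), plan
```

In an imputation setting, a quantity of this kind would be minimized with respect to the imputed entries, typically with an automatic differentiation framework rather than NumPy.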
Digital Business Models
This book provides an overview of how digital players create, exchange and capture value thanks to digital technologies. It describes the key characteristics of various digital business models using different business archetypes. Each chapter is illustrated with examples or mini-case studies and also comprises a toolbox describing strategic tools, canvases and frameworks that help managers analyse a situation and formulate proactive solutions.
Gradient strikes back: How filtering out high frequencies improves explanations
Recent years have witnessed an explosion in the development of novel
prediction-based attribution methods, which have slowly been supplanting older
gradient-based methods to explain the decisions of deep neural networks.
However, it is still not clear why prediction-based methods outperform
gradient-based ones. Here, we start with an empirical observation: these two
approaches yield attribution maps with very different power spectra, with
gradient-based methods revealing more high-frequency content than
prediction-based methods. This observation raises multiple questions: What is
the source of this high-frequency information, and does it truly reflect
decisions made by the system? Lastly, why would the absence of high-frequency
information in prediction-based methods yield better explainability scores
along multiple metrics? We analyze the gradient of three representative visual
classification models and observe that it contains noisy information emanating
from high frequencies. Furthermore, our analysis reveals that the operations
used in Convolutional Neural Networks (CNNs) for downsampling appear to be a
significant source of this high-frequency content -- suggesting aliasing as a
possible underlying basis. We then apply an optimal low-pass filter for
attribution maps and demonstrate that it improves gradient-based attribution
methods. We show that (i) removing high-frequency noise yields significant
improvements in the explainability scores obtained with gradient-based methods
across multiple models -- leading to (ii) a novel ranking of state-of-the-art
methods with gradient-based methods at the top. We believe that our results
will spur renewed interest in simpler and computationally more efficient
gradient-based methods for explainability.
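Low-pass filtering of an attribution map, as described in this abstract, can be sketched in the Fourier domain. The example below is an illustration only: it assumes a single-channel 2D map and a Gaussian frequency mask with a hand-picked width `sigma` (in cycles per pixel), whereas the abstract refers to an optimal filter whose construction is not given here.

```python
import numpy as np

def low_pass_filter(attribution, sigma=0.1):
    """Attenuate high spatial frequencies of a 2D attribution map.

    sigma is a hypothetical cutoff parameter for the Gaussian mask.
    """
    height, width = attribution.shape
    freq_y = np.fft.fftfreq(height)[:, None]
    freq_x = np.fft.fftfreq(width)[None, :]
    # The mask equals 1 at zero frequency and decays with frequency.
    mask = np.exp(-(freq_x**2 + freq_y**2) / (2.0 * sigma**2))
    spectrum = np.fft.fft2(attribution)
    return np.real(np.fft.ifft2(spectrum * mask))
```

A constant map passes through unchanged, while a pixel-level checkerboard pattern, which sits at the highest representable frequency, is almost entirely removed.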
Entropic Optimal Transport between Unbalanced Gaussian Measures has a Closed Form
Although optimal transport (OT) problems admit closed form solutions in very few notable cases, e.g. in 1D or between Gaussians, these closed forms have proved extremely fecund for practitioners to define tools inspired from the OT geometry. On the other hand, the numerical resolution of OT problems using entropic regularization has given rise to many applications, but because there are no known closed-form solutions for entropic regularized OT problems, these approaches are mostly algorithmic, not informed by elegant closed forms. In this paper, we propose to fill the void at the intersection between these two schools of thought in OT by proving that the entropy-regularized optimal transport problem between two Gaussian measures admits a closed form. Contrary to the unregularized case, for which the explicit form is given by the Wasserstein-Bures distance, the closed form we obtain is differentiable everywhere, even for Gaussians with degenerate covariance matrices. We obtain this closed form solution by solving the fixed-point equation behind Sinkhorn's algorithm, the default method for computing entropic regularized OT. Remarkably, this approach extends to the generalized unbalanced case, where Gaussian measures are scaled by positive constants. This extension leads to a closed form expression for unbalanced Gaussians as well, and highlights the mass transportation / destruction trade-off seen in unbalanced optimal transport. Moreover, in both settings, we show that the optimal transportation plans are (scaled) Gaussians and provide analytical formulas of their parameters. These formulas constitute the first non-trivial closed forms for entropy-regularized optimal transport, thus providing a ground truth for the analysis of entropic OT and Sinkhorn's algorithm.
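For reference, the unregularized closed form mentioned in this abstract, the Wasserstein-Bures distance between two Gaussians, can be computed directly from the means and covariances. The sketch below covers only that classical unregularized formula and does not reproduce the paper's entropic or unbalanced closed forms. The helper names are illustrative, and matrix square roots are taken through an eigendecomposition, assuming symmetric positive semidefinite covariances.

```python
import numpy as np

def psd_sqrt(matrix):
    """Square root of a symmetric positive semidefinite matrix."""
    eigvals, eigvecs = np.linalg.eigh(matrix)
    eigvals = np.clip(eigvals, 0.0, None)  # guard against tiny negative values
    return (eigvecs * np.sqrt(eigvals)) @ eigvecs.T

def bures_wasserstein_squared(mean_a, cov_a, mean_b, cov_b):
    """Squared 2-Wasserstein distance between N(mean_a, cov_a) and N(mean_b, cov_b)."""
    root_a = psd_sqrt(cov_a)
    cross = psd_sqrt(root_a @ cov_b @ root_a)
    bures = np.trace(cov_a) + np.trace(cov_b) - 2.0 * np.trace(cross)
    return float(np.sum((mean_a - mean_b) ** 2) + bures)
```

With diagonal covariances the Bures term reduces to the sum of squared differences between the standard deviations, which gives a quick sanity check.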