3 research outputs found
Constrained Reweighting of Distributions: an Optimal Transport Approach
We commonly encounter the problem of identifying an optimally weight adjusted
version of the empirical distribution of observed data, adhering to predefined
constraints on the weights. Such constraints often manifest as restrictions on
the moments, tail behaviour, shapes, number of modes, etc., of the resulting
weight adjusted empirical distribution. In this article, we substantially
enhance the flexibility of such methodology by introducing nonparametrically
imbued distributional constraints on the weights, and developing a general
framework leveraging the maximum entropy principle and tools from optimal
transport. The key idea is to ensure that the maximum entropy weight adjusted
empirical distribution of the observed data is close to a pre-specified
probability distribution in terms of the optimal transport metric while
allowing for subtle departures. The versatility of the framework is
demonstrated in the context of three disparate applications where data
re-weighting is warranted to satisfy side constraints on the optimization
problem at the heart of the statistical task: namely, portfolio allocation,
semi-parametric inference for complex surveys, and ensuring algorithmic
fairness in machine learning algorithms.
Comment: arXiv admin note: text overlap with arXiv:2303.1008
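The simplest case mentioned in the abstract above, reweighting the empirical distribution under a moment constraint, can be sketched as follows. This is an illustration only and not the paper's optimal transport framework: the function name `max_entropy_weights`, the single mean constraint, and the bisection solver are all assumptions introduced here. It relies on the standard fact that maximum entropy weights under a mean constraint take an exponential-tilting form.

```python
import math


def max_entropy_weights(xs, target_mean, tol=1e-10):
    """Maximum entropy weights on xs whose weighted mean equals target_mean.

    Hypothetical helper: the solution has the form w_i proportional to
    exp(lam * x_i), and lam is found by bisection because the tilted mean
    is increasing in lam.
    """
    if not (min(xs) < target_mean < max(xs)):
        raise ValueError("target_mean must lie strictly inside the data range")

    def tilted(lam):
        # Subtract the largest exponent for numerical stability.
        shift = max(lam * x for x in xs)
        raw = [math.exp(lam * x - shift) for x in xs]
        total = sum(raw)
        return [r / total for r in raw]

    def mean(lam):
        return sum(w * x for w, x in zip(tilted(lam), xs))

    # Expand the bracket until it contains the solution.
    lo, hi = -1.0, 1.0
    while mean(lo) > target_mean:
        lo *= 2.0
    while mean(hi) < target_mean:
        hi *= 2.0

    # Bisect on lam until the bracket is narrower than tol.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean(mid) < target_mean:
            lo = mid
        else:
            hi = mid
    return tilted(0.5 * (lo + hi))


xs = [1.0, 2.0, 3.0, 4.0]
weights = max_entropy_weights(xs, target_mean=3.0)
```

Replacing the mean constraint with a bound on an optimal transport distance to a reference distribution, as the abstract describes, would require a different solver and is not attempted here.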
A survey of Identification and mitigation of Machine Learning algorithmic biases in Image Analysis
The problem of algorithmic bias in machine learning has gained a lot of
attention in recent years due to its concrete and potentially hazardous
implications in society. In much the same manner, biases can also affect modern
industrial and safety-critical applications where machine learning is based on
high-dimensional inputs such as images. This issue has however been mostly left
out of the spotlight in the machine learning literature. Contrary to societal
applications where a set of proxy variables can be provided by common sense
or by regulations to draw attention to potential risks, industrial and
safety-critical applications are most of the time sailing blind. The variables
related to undesired biases can indeed be indirectly represented in the input
data, or can be unknown, thus making them harder to tackle. This raises serious
and well-founded concerns about the commercial deployment of AI-based
solutions, especially in a context where new regulations clearly address the
issues raised by undesired biases in AI. Consequently, we provide here
an overview of recent advances in this area, first by presenting how such
biases can manifest themselves, then by exploring different ways to bring
them to light, and by probing different possibilities to mitigate them. We
finally present a practical remote sensing use-case of industrial Fairness