On the Fairness ROAD: Robust Optimization for Adversarial Debiasing
In the field of algorithmic fairness, significant attention has been devoted
to group fairness criteria, such as Demographic Parity and Equalized Odds.
Nevertheless, these objectives, measured as global averages, have raised
concerns about persistent local disparities between sensitive groups. In this
work, we address the problem of local fairness, which ensures that the
predictor is unbiased not only in terms of expectations over the whole
population, but also within any subregion of the feature space, unknown at
training time. To enforce this objective, we introduce ROAD, a novel approach
that leverages the Distributionally Robust Optimization (DRO) framework within
a fair adversarial learning objective, where an adversary tries to infer the
sensitive attribute from the predictions. Using an instance-level re-weighting
strategy, ROAD is designed to prioritize inputs that are likely to be locally
unfair, i.e. where the adversary faces the least difficulty in reconstructing
the sensitive attribute. Numerical experiments demonstrate the effectiveness of
our method: it achieves Pareto dominance with respect to local fairness and
accuracy for a given global fairness level across three standard datasets, and
also enhances fairness generalization under distribution shift.
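The min-max mechanics described above can be made concrete in a few lines. Below is a minimal PyTorch sketch of an instance-level re-weighted adversarial objective in the spirit of ROAD; the exponential weighting with temperature `tau`, the network sizes, and all function names are illustrative assumptions, not the paper's exact formulation.

```python
# Hypothetical sketch of a ROAD-style re-weighted adversarial objective.
import torch
import torch.nn as nn

predictor = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
adversary = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))
bce = nn.BCEWithLogitsLoss(reduction="none")

def road_losses(x, y, s, lam=1.0, tau=2.0):
    """One step of the min-max objective on a batch.

    x: features; y: labels; s: binary sensitive attribute (float tensors).
    lam trades accuracy against fairness; tau controls how sharply the
    weights focus on locally unfair instances (assumed form).
    """
    y_logit = predictor(x).squeeze(-1)
    s_logit = adversary(y_logit.unsqueeze(-1)).squeeze(-1)

    pred_loss = bce(y_logit, y)   # per-instance prediction loss
    adv_loss = bce(s_logit, s)    # per-instance adversary loss

    # Boltzmann-style instance weights: largest where the adversary's loss
    # is smallest, i.e. where the sensitive attribute is easiest to recover.
    with torch.no_grad():
        w = torch.softmax(-tau * adv_loss, dim=0) * len(adv_loss)

    # The predictor minimizes its loss minus the weighted adversary payoff;
    # the adversary minimizes its own weighted loss (in practice via
    # alternating updates, with y_logit detached on the adversary step).
    predictor_obj = pred_loss.mean() - lam * (w * adv_loss).mean()
    adversary_obj = (w * adv_loss).mean()
    return predictor_obj, adversary_obj
```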
Reducing undesirable bias in machine learning through adversarial mitigation (Réduire les biais indésirables en apprentissage automatique par atténuation adverse)
The past few years have seen a dramatic rise in academic and societal interest in fair machine learning. As a result, significant work has been done to include fairness constraints in the training objective of machine learning algorithms. Its primary purpose is to ensure that model predictions do not depend on any sensitive attribute, such as gender or race. Although this notion of independence is incontestable in a general context, it can theoretically be defined in many different ways, depending on how one views fairness. Consequently, many recent papers tackle this challenge using their "own" objectives and notions of fairness. These objectives fall into two families: individual fairness and group fairness. This thesis first gives an overview of the methodologies applied in these two families in order to encourage good practices. We then identify and fill gaps by presenting new metrics and new fair machine learning (Fair-ML) algorithms that are better suited to specific contexts.
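As a companion to the group-fairness notions this thesis surveys, here is a small self-contained example computing the two criteria named in the first abstract above, Demographic Parity and Equalized Odds, on binary predictions; the function names and toy data are illustrative, not taken from the thesis.

```python
# Illustrative group-fairness metrics for binary predictions.
import numpy as np

def demographic_parity_gap(y_pred, s):
    """|P(Y_hat=1 | S=1) - P(Y_hat=1 | S=0)| for binary y_pred and s."""
    return abs(y_pred[s == 1].mean() - y_pred[s == 0].mean())

def equalized_odds_gap(y_pred, y_true, s):
    """Largest gap in positive-prediction rates across groups, conditioned
    on the true label (covers both TPR and FPR disparities)."""
    gaps = []
    for y in (0, 1):
        mask = y_true == y
        gaps.append(abs(y_pred[mask & (s == 1)].mean()
                        - y_pred[mask & (s == 0)].mean()))
    return max(gaps)

# Toy usage with random data.
rng = np.random.default_rng(0)
y_pred = rng.integers(0, 2, 1000)
y_true = rng.integers(0, 2, 1000)
s = rng.integers(0, 2, 1000)
print(demographic_parity_gap(y_pred, s),
      equalized_odds_gap(y_pred, y_true, s))
```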
A fair pricing model via adversarial learning
At the core of the insurance business lies the classification of insureds as
risky or non-risky; actuarial fairness means that risky insureds should
contribute more and pay a higher premium than non-risky or less-risky ones.
Actuaries, therefore, use econometric or machine learning techniques to
classify, but the distinction between a fair actuarial classification and
"discrimination" is subtle. For this reason, there is a growing interest about
fairness and discrimination in the actuarial community Lindholm, Richman,
Tsanakas, and Wuthrich (2022). Presumably, non-sensitive characteristics can
serve as substitutes or proxies for protected attributes. For example, the
color and model of a car, combined with the driver's occupation, may lead to an
undesirable gender bias in the prediction of car insurance prices.
Surprisingly, we will show that (1) debiasing the predictor alone may be
insufficient to maintain adequate accuracy. Indeed, the traditional pricing
model is currently built in a two-stage structure that considers many
potentially biased components such as car or geographic risks. We will show
that this traditional structure has significant limitations in achieving
fairness. For this reason, we have developed a novel pricing model approach.
Recently, some approaches (Blier-Wong, Cossette, Lamontagne, and Marceau,
2021; Wuthrich and Merz, 2021) have shown the value of autoencoders in
pricing. In this paper, we will show that (2) this can be generalized to
multiple pricing factors (geographic, car type) and (3) it is perfectly
adapted to a fairness context (since it allows debiasing the set of pricing
components): we extend this main
idea to a general framework in which a single whole pricing model is trained by
generating the geographic and car pricing components needed to predict the pure
premium while mitigating the unwanted bias according to the desired metric.
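To make the architecture sketched above more tangible, here is a hedged PyTorch outline of a single pricing model that learns geographic and car components with autoencoders, predicts a pure premium from their codes, and debiases those codes adversarially against a sensitive attribute; all dimensions, names, and the Poisson-style loss are illustrative assumptions rather than the paper's exact specification.

```python
# Illustrative end-to-end fair pricing model: autoencoded pricing
# components + pure-premium head + adversarial debiasing of the codes.
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, d_in, d_code):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(d_in, 16), nn.ReLU(),
                                 nn.Linear(16, d_code))
        self.dec = nn.Sequential(nn.Linear(d_code, 16), nn.ReLU(),
                                 nn.Linear(16, d_in))

    def forward(self, x):
        z = self.enc(x)
        return z, self.dec(z)

geo_ae = Autoencoder(d_in=20, d_code=2)   # geographic pricing component
car_ae = Autoencoder(d_in=15, d_code=2)   # car/vehicle pricing component
premium_head = nn.Sequential(nn.Linear(2 + 2 + 5, 32), nn.ReLU(),
                             nn.Linear(32, 1))
adversary = nn.Sequential(nn.Linear(2 + 2, 8), nn.ReLU(), nn.Linear(8, 1))
mse, bce = nn.MSELoss(), nn.BCEWithLogitsLoss()

def pricing_losses(x_geo, x_car, x_other, exposure, claims, s, lam=1.0):
    z_geo, rec_geo = geo_ae(x_geo)
    z_car, rec_car = car_ae(x_car)
    z = torch.cat([z_geo, z_car], dim=1)

    # Pure premium via a Poisson-style claim rate scaled by exposure.
    log_rate = premium_head(torch.cat([z, x_other], dim=1)).squeeze(-1)
    mu = exposure * log_rate.exp()
    poisson_nll = (mu - claims * torch.log(mu + 1e-8)).mean()

    recon = mse(rec_geo, x_geo) + mse(rec_car, x_car)
    # Can the learned pricing codes predict the sensitive attribute?
    adv = bce(adversary(z).squeeze(-1), s)

    # Minimize pricing + reconstruction loss while maximizing the
    # adversary's loss (alternating updates, as usual).
    return poisson_nll + recon - lam * adv, adv
```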
Joint modeling of claim frequencies and behavioral signals in motor insurance
Telematics devices installed in insured vehicles provide actuaries with new risk factors, such as the time of the day, average speeds, and other driving habits. This paper extends the multivariate mixed model describing the joint dynamics of telematics data and claim frequencies proposed by Denuit et al. (2019a) by allowing for signals with various formats, not necessarily integer-valued, and by replacing the estimation procedure with the Expected Conditional Maximization algorithm. A numerical study performed on a database related to Pay-How-You-Drive (PHYD) motor insurance illustrates the relevance of the proposed approach for practice.
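The paper's multivariate mixed model and its ECM estimation are beyond a short sketch, but the frequency side of such a model can be illustrated with a standard Poisson regression of claim counts on telematics-style covariates with an exposure offset; the covariate names and data below are synthetic, and this sketch does not reproduce the paper's joint model.

```python
# Minimal Poisson claim-frequency illustration with synthetic telematics data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 5000
night_share = rng.uniform(0, 1, n)    # share of km driven at night
avg_speed = rng.normal(60, 15, n)     # average speed (km/h)
exposure = rng.uniform(0.1, 1.0, n)   # policy-years observed

# Synthetic ground truth: more night driving and higher speeds -> more claims.
lam = exposure * np.exp(-2.0 + 0.8 * night_share + 0.01 * avg_speed)
claims = rng.poisson(lam)

X = sm.add_constant(np.column_stack([night_share, avg_speed]))
model = sm.GLM(claims, X, family=sm.families.Poisson(),
               offset=np.log(exposure)).fit()
print(model.summary())
```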