Eliminating Latent Discrimination: Train Then Mask
How can we control for latent discrimination in predictive models? How can we
provably remove it? Such questions are at the heart of algorithmic fairness and
its impact on society. In this paper, we define a new operational fairness
criterion, inspired by the well-understood notion of omitted-variable bias in
statistics and econometrics. Our notion of fairness effectively controls for
sensitive features and provides diagnostics for deviations from fair decision
making. We then establish analytical and algorithmic results about the
existence of a fair classifier in the context of supervised learning. Our
results readily imply a simple, but rather counterintuitive, strategy for
eliminating latent discrimination. To prevent other features from serving as
proxies for sensitive features, we need to include sensitive features in the
training phase but exclude them in the test/evaluation phase, while controlling
for their effects. We evaluate the performance of our algorithm on several
real-world datasets and show how fairness on these datasets can be improved
with a very small loss in accuracy.
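A minimal sketch of the train-then-mask idea, assuming a scikit-learn-style logistic regression on synthetic data: the sensitive feature is included when fitting, so other features cannot silently proxy for it, then replaced by a fixed reference value at prediction time. The dataset, model choice, and masking value are illustrative assumptions, not the paper's exact procedure.

```python
# Sketch of "train then mask": fit WITH the sensitive feature, then
# neutralize it at test time. All data here is synthetic and the
# masking value is one simple illustrative choice.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000
sensitive = rng.integers(0, 2, size=n)          # e.g., a binary group label
x1 = rng.normal(size=n) + 0.8 * sensitive       # feature correlated with group
x2 = rng.normal(size=n)
y = (x1 + x2 + 0.5 * sensitive + rng.normal(scale=0.5, size=n) > 0).astype(int)

X_train = np.column_stack([x1, x2, sensitive])  # sensitive feature INCLUDED
model = LogisticRegression().fit(X_train, y)

# Test/evaluation phase: mask the sensitive column with a fixed reference
# value, so predictions no longer vary with group membership.
X_test = X_train.copy()
X_test[:, 2] = sensitive.mean()
masked_preds = model.predict(X_test)
print("accuracy with masked sensitive feature:", (masked_preds == y).mean())
```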
Achieving Long-term Fairness in Submodular Maximization through Randomization
Submodular function optimization has numerous applications in machine
learning and data analysis, including data summarization, which aims to
identify a concise and diverse set of data points from a large dataset. It is
important to implement fairness-aware algorithms when dealing with data items
that may contain sensitive attributes such as race or gender, to prevent biases
that could lead to unequal representation of different groups. With this in
mind, we investigate the problem of maximizing a monotone submodular function
subject to group fairness constraints. Unlike previous studies in this area, we
allow for randomized solutions: the objective is to compute a distribution over
feasible sets such that the expected number of items selected from each group
satisfies given upper and lower thresholds, ensuring that the representation of
each group remains balanced in the long term. Here a set is considered feasible
if its size does not exceed a given constant. Our research includes the
development of a series of approximation algorithms for this problem.
Comment: This paper has been accepted to the 19th Cologne-Twente Workshop on
Graphs and Combinatorial Optimization.
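To make the randomized notion concrete, here is a toy sketch, not one of the paper's algorithms: on a small coverage instance we pick the best feasible set under each of two group-count profiles, then mix them with probability p so that the expected count of group-A items hits a fractional target no single feasible set can achieve. The objective, groups, thresholds, and mixing weight are all illustrative assumptions.

```python
# Toy illustration of randomizing over feasible sets so that EXPECTED
# per-group counts meet fairness thresholds. Brute-force enumeration is
# fine at this scale; the paper develops approximation algorithms.
import itertools
import random

items = list(range(6))
group = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
covers = {0: {1, 2}, 1: {2, 3}, 2: {3, 4}, 3: {4, 5}, 4: {5, 6}, 5: {6, 7}}
b = 3                                    # feasibility: |S| <= b

def f(S):
    """Monotone submodular coverage objective."""
    return len(set().union(*(covers[i] for i in S))) if S else 0

feasible = [S for r in range(b + 1) for S in itertools.combinations(items, r)]
best_1A = max((S for S in feasible if sum(group[i] == "A" for i in S) == 1), key=f)
best_2A = max((S for S in feasible if sum(group[i] == "A" for i in S) == 2), key=f)

# Mixing with p = 0.5 gives an expected group-A count of 1.5, a target
# that lies strictly between what either deterministic set provides.
p = 0.5
chosen = best_1A if random.random() < p else best_2A
print("chosen set:", chosen, "value:", f(chosen))
```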
Variational Fair Clustering
We propose a general variational framework for fair clustering, which
integrates an original Kullback-Leibler (KL) fairness term with a large class
of clustering objectives, including prototype-based and graph-based ones.
Fundamentally different from the existing combinatorial and spectral solutions,
our variational multi-term approach makes it possible to control the trade-off
between the fairness and clustering objectives. We derive a general tight upper
bound based on a concave-convex decomposition of our fairness term, its
Lipschitz-gradient property, and Pinsker's inequality. This tight upper bound
can be jointly optimized with various clustering objectives, while yielding a
scalable solution with a convergence guarantee. Interestingly, at each
iteration, it performs an independent update for each assignment variable;
therefore, it can be easily distributed for large-scale datasets. This
scalability is important as it allows different trade-off levels between the
fairness and clustering objectives to be explored. Unlike spectral relaxation,
our formulation does not require computing an eigenvalue decomposition. We
report comprehensive evaluations and comparisons with state-of-the-art methods
over various fair-clustering benchmarks, which show that our variational
formulation can yield highly competitive solutions in terms of fairness and
clustering objectives.
Comment: Accepted to be published in AAAI 2021. The code is available at:
https://github.com/imtiazziko/Variational-Fair-Clustering
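The sketch below illustrates the flavor of the point-wise updates, assuming a simplified fairness-penalized soft K-means: each iteration recomputes centroids, evaluates a KL-style group-balance penalty from the previous assignments, and then updates every point's assignment independently via a softmax. The penalty form, penalty weight, and data are assumptions; the authors' repository contains the actual bound-optimization algorithm.

```python
# Simplified sketch of independent per-point assignment updates for
# fairness-penalized clustering. The penalty gradient is a stand-in for
# the paper's concave-convex bound, not the published update rule.
import numpy as np

rng = np.random.default_rng(0)
N, K, J, lam = 200, 2, 2, 2.0
X = np.concatenate([rng.normal(-2, 1, (N // 2, 2)),
                    rng.normal(2, 1, (N // 2, 2))])
V = rng.integers(0, J, size=N)                 # demographic group of each point
U = np.bincount(V, minlength=J) / N            # target (overall) proportions

S = rng.dirichlet(np.ones(K), size=N)          # soft assignments, rows sum to 1
for _ in range(50):
    C = (S.T @ X) / (S.sum(0)[:, None] + 1e-12)        # centroid update
    D = ((X[:, None, :] - C[None]) ** 2).sum(-1)       # N x K squared distances
    # Group proportions within each cluster, from the previous assignments.
    P = np.stack([S[V == j].sum(0) for j in range(J)])  # J x K group mass
    P /= P.sum(0, keepdims=True)
    # Penalty is positive where a point's group is already overrepresented.
    G = np.log(np.maximum(P[V], 1e-12) / U[V][:, None])
    logits = -(D + lam * G)
    S = np.exp(logits - logits.max(1, keepdims=True))   # stable softmax,
    S /= S.sum(1, keepdims=True)                        # one update per point

labels = S.argmax(1)
print("cluster sizes:", np.bincount(labels, minlength=K))
```

Because each row of S is updated independently given the previous iterate, the inner loop is trivially parallelizable across points, which is the property the abstract highlights for large-scale distribution.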