The Flawed Foundations of Fair Machine Learning
The definition and implementation of fairness in automated decisions have been
extensively studied by the research community. Yet fallacious reasoning,
misleading assertions, and questionable practices lie at the foundations of the
current fair machine learning paradigm. Those flaws result from a failure to
understand that the trade-off between statistically accurate outcomes and group
similar outcomes exists as an independent, external constraint rather than as a
subjective manifestation, as has been commonly argued. First,
we explain that there is only one conception of fairness present in the fair
machine learning literature: group similarity of outcomes based on a sensitive
attribute where the similarity benefits an underprivileged group. Second, we
show that there is, in fact, a trade-off between statistically accurate
outcomes and group similar outcomes in any data setting where group disparities
exist, and that the trade-off presents an existential threat to the equitable,
fair machine learning approach. Third, we introduce a proof-of-concept
evaluation to aid researchers and designers in understanding the relationship
between statistically accurate outcomes and group similar outcomes. Finally,
we provide suggestions for future work, aimed at data scientists, legal
scholars, and data ethicists, that utilize the conceptual and experimental
framework described throughout this article.
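As a rough illustration of the trade-off the article describes (not the authors' own proof-of-concept evaluation), the sketch below sweeps a decision threshold on synthetic data with a built-in group disparity and reports overall accuracy alongside the gap in positive-outcome rates between groups; the data, thresholds, and metric choice are assumptions made for illustration.

```python
# Minimal sketch of the accuracy vs. group-similar-outcomes trade-off.
# A threshold is swept on a score; at each threshold we record overall
# accuracy and the gap in positive-outcome rates between two groups.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
group = rng.integers(0, 2, size=n)                  # sensitive attribute (0/1)
# Synthetic group disparity: group 1 has lower scores on average.
score = rng.normal(loc=np.where(group == 1, -0.5, 0.5), scale=1.0)
label = (score + rng.normal(scale=0.5, size=n) > 0).astype(int)

for t in np.linspace(-1.0, 1.0, 5):
    pred = (score > t).astype(int)
    accuracy = (pred == label).mean()
    rate_gap = abs(pred[group == 0].mean() - pred[group == 1].mean())
    print(f"threshold={t:+.2f}  accuracy={accuracy:.3f}  outcome-rate gap={rate_gap:.3f}")
```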
Fairness for Cooperative Multi-Agent Learning with Equivariant Policies
We study fairness through the lens of cooperative multi-agent learning. Our
work is motivated by empirical evidence that naive maximization of team reward
yields unfair outcomes for individual team members. To address fairness in
multi-agent contexts, we introduce team fairness, a group-based fairness
measure for multi-agent learning. We then incorporate team fairness into policy
optimization -- introducing Fairness through Equivariance (Fair-E), a novel
learning strategy that achieves provably fair reward distributions. We then
introduce Fairness through Equivariance Regularization (Fair-ER) as a
soft-constraint version of Fair-E and show that Fair-ER reaches higher levels
of utility than Fair-E and fairer outcomes than policies with no equivariance.
Finally, we investigate the fairness-utility trade-off in multi-agent settings.
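The abstract does not spell out the exact definitions of team fairness or the equivariance regularizer, so the sketch below is only a hedged illustration: a group-based gap in mean per-agent return stands in for the fairness measure, and it is subtracted from the team reward as a soft penalty, in the spirit of Fair-ER's soft constraint. The coefficient `lam` and the helper names are hypothetical.

```python
# Illustrative only: a group-based reward gap stands in for the team
# fairness measure and is applied as a soft penalty on the team objective.
import numpy as np

def team_fairness_gap(agent_returns, agent_groups):
    """Max difference in mean return between agent groups (smaller = fairer)."""
    means = [agent_returns[agent_groups == g].mean() for g in np.unique(agent_groups)]
    return max(means) - min(means)

def regularized_objective(agent_returns, agent_groups, lam=0.5):
    """Team reward minus a soft fairness penalty (hypothetical weight `lam`)."""
    return agent_returns.sum() - lam * team_fairness_gap(agent_returns, agent_groups)

returns = np.array([3.0, 2.8, 0.9, 1.1])   # per-agent episode returns
groups = np.array([0, 0, 1, 1])            # group membership of each agent
print(team_fairness_gap(returns, groups), regularized_objective(returns, groups))
```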
A Theoretical Approach to Characterize the Accuracy-Fairness Trade-off Pareto Frontier
While the accuracy-fairness trade-off has been frequently observed in the
literature of fair machine learning, rigorous theoretical analyses have been
scarce. To demystify this long-standing challenge, this work seeks to develop a
theoretical framework by characterizing the shape of the accuracy-fairness
trade-off Pareto frontier (FairFrontier), determined by the set of all
Pareto-optimal classifiers that no other classifier can dominate. Specifically, we
first demonstrate the existence of the trade-off in real-world scenarios and
then propose four potential categories to characterize the important properties
of the accuracy-fairness Pareto frontier. For each category, we identify the
necessary conditions that lead to the corresponding trade-offs. Experimental
results on synthetic data yield several insights from the proposed framework:
(1) when sensitive attributes can be fully interpreted by non-sensitive
attributes, FairFrontier is mostly continuous; (2) accuracy can suffer a sharp
decline when fairness is over-pursued; (3) the trade-off can be eliminated via
a streamlined two-step approach. The proposed research enables an in-depth
understanding of the accuracy-fairness trade-off, pushing current fair
machine-learning research to a new frontier.
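A minimal sketch of the Pareto-frontier idea, under the assumption that each candidate classifier is summarized by an (accuracy, unfairness) pair: a point is kept if no other point is at least as good on both criteria and strictly better on one. The random candidate pool and the `pareto_frontier` helper are illustrative, not the paper's construction.

```python
# Extract an accuracy-fairness Pareto frontier from a pool of candidate
# classifiers, each summarized by (accuracy, unfairness). The pool here is
# random and stands in for trained models.
import numpy as np

rng = np.random.default_rng(1)
candidates = np.column_stack([
    rng.uniform(0.6, 0.95, size=200),   # accuracy of each candidate model
    rng.uniform(0.0, 0.3, size=200),    # unfairness (e.g., outcome-rate gap)
])

def pareto_frontier(points):
    """Keep points not dominated under (maximize accuracy, minimize unfairness)."""
    keep = []
    for i, (acc_i, unf_i) in enumerate(points):
        dominated = any(
            (acc_j >= acc_i and unf_j <= unf_i) and (acc_j > acc_i or unf_j < unf_i)
            for j, (acc_j, unf_j) in enumerate(points) if j != i
        )
        if not dominated:
            keep.append((acc_i, unf_i))
    return sorted(keep)

for acc, unf in pareto_frontier(candidates):
    print(f"accuracy={acc:.3f}  unfairness={unf:.3f}")
```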
Counterpart Fairness -- Addressing Systematic between-group Differences in Fairness Evaluation
When using machine learning (ML) to aid decision-making, it is critical to
ensure that an algorithmic decision is fair, i.e., it does not discriminate
against specific individuals/groups, particularly those from underprivileged
populations. Existing group fairness methods require group-wise measures to be
equal, which, however, fails to account for systematic between-group
differences. Confounding factors, which are non-sensitive variables that
manifest systematic differences, can significantly affect fairness evaluation.
To mitigate this
problem, we believe that a fairness measurement should be based on the
comparison between counterparts (i.e., individuals who are similar to each
other with respect to the task of interest) from different groups, whose group
identities cannot be distinguished algorithmically by exploring confounding
factors. We have developed a propensity-score-based method for identifying
counterparts, which prevents fairness evaluation from comparing "oranges" with
"apples". In addition, we propose a counterpart-based statistical fairness
index, termed Counterpart-Fairness (CFair), to assess fairness of ML models.
Empirical studies on the Medical Information Mart for Intensive Care (MIMIC)-IV
database were conducted to validate the effectiveness of CFair. We publish our
code at https://github.com/zhengyjo/CFair.
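A hedged sketch of the propensity-score-matching step described above (not the CFair index itself): a logistic model predicts group membership from non-sensitive confounders, each group-1 individual is paired with the group-0 individual whose propensity score is closest, and the audited model's scores are compared within those pairs. The synthetic data and variable names are assumptions.

```python
# Propensity-score matching to form counterparts across groups, so that
# model outcomes are compared within matched pairs rather than raw groups.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
n = 1_000
group = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, 3)) + group[:, None] * 0.7   # confounders differ by group
model_scores = rng.uniform(size=n)                   # outputs of the model under audit

propensity = LogisticRegression().fit(X, group).predict_proba(X)[:, 1]

idx0, idx1 = np.where(group == 0)[0], np.where(group == 1)[0]
pair_gaps = []
for i in idx1:
    j = idx0[np.argmin(np.abs(propensity[idx0] - propensity[i]))]  # nearest counterpart
    pair_gaps.append(model_scores[i] - model_scores[j])

print("mean counterpart score gap:", np.mean(pair_gaps))
```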
On The Fairness Impacts of Hardware Selection in Machine Learning
In the machine learning ecosystem, hardware selection is often regarded as a
mere utility, overshadowed by the spotlight on algorithms and data. This
oversight is particularly problematic in contexts like ML-as-a-service
platforms, where users often lack control over the hardware used for model
deployment. How does the choice of hardware impact generalization properties?
This paper investigates the influence of hardware on the delicate balance
between model performance and fairness. We demonstrate that hardware choices
can exacerbate existing disparities, attributing these discrepancies to
variations in gradient flows and loss surfaces across different demographic
groups. Through both theoretical and empirical analysis, the paper not only
identifies the underlying factors but also proposes an effective strategy for
mitigating hardware-induced performance imbalances.
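As a hedged illustration of the kind of audit this finding suggests (not the paper's theoretical or empirical analysis), the sketch below compares the per-group accuracy gap of predictions obtained under two hypothetical hardware configurations, labeled `hw_a` and `hw_b`; the predictions here are simulated.

```python
# Compare the per-group accuracy gap of predictions produced under two
# hypothetical hardware configurations (simulated predictions).
import numpy as np

def group_accuracy_gap(preds, labels, groups):
    accs = [(preds[groups == g] == labels[groups == g]).mean() for g in np.unique(groups)]
    return max(accs) - min(accs)

rng = np.random.default_rng(3)
labels = rng.integers(0, 2, size=5_000)
groups = rng.integers(0, 2, size=5_000)
# Stand-in predictions: hw_b degrades accuracy for group 1 only.
preds_hw_a = np.where(rng.uniform(size=5_000) < 0.90, labels, 1 - labels)
preds_hw_b = np.where(rng.uniform(size=5_000) < np.where(groups == 1, 0.82, 0.92),
                      labels, 1 - labels)

print("gap on hw_a:", group_accuracy_gap(preds_hw_a, labels, groups))
print("gap on hw_b:", group_accuracy_gap(preds_hw_b, labels, groups))
```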
Ensuring generalized fairness in batch classification
In this paper, we consider the problem of batch classification and propose a novel framework for achieving fairness in such settings. Batch classification involves selecting a set of individuals, as is often encountered in real-world scenarios such as job recruitment and college admissions. This is in contrast to a typical classification problem, where each candidate in the test set is considered separately and independently. In such scenarios, achieving the same acceptance rate (i.e., the probability of the classifier assigning the positive class) for each group (membership determined by the value of sensitive attributes such as gender or race) is often not desirable, and the regulatory body instead specifies a different acceptance rate for each group. Existing fairness-enhancing methods do not allow for such specifications and hence are unsuited to these scenarios. In this paper, we define a configuration model whereby the acceptance rate of each group can be regulated, and we further introduce a novel batch-wise fairness post-processing framework using the classifier's confidence scores. We deploy our framework across four real-world datasets and two popular notions of fairness, namely demographic parity and equalized odds. In addition to consistent performance improvements over the competing baselines, the proposed framework allows flexibility and significant speed-up. It can also seamlessly incorporate multiple overlapping sensitive attributes. To further demonstrate the generalizability of our framework, we deploy it to the problem of fair gerrymandering, where it achieves a better fairness-accuracy trade-off than the existing baseline method.
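A minimal sketch of a batch-wise post-processing step consistent with the description above, assuming a hypothetical configuration mapping each group to its regulator-specified acceptance rate: within each group, the highest-confidence candidates are accepted until that group's rate is met. Function and variable names are illustrative, not the paper's implementation.

```python
# Batch-wise post-processing with per-group acceptance rates taken from a
# hypothetical configuration dict; decisions use classifier confidence scores.
import numpy as np

def batch_postprocess(scores, groups, acceptance_rates):
    """Return 0/1 decisions meeting each group's configured acceptance rate."""
    decisions = np.zeros_like(scores, dtype=int)
    for g, rate in acceptance_rates.items():
        idx = np.where(groups == g)[0]
        k = int(round(rate * len(idx)))                 # accepted slots for group g
        top = idx[np.argsort(scores[idx])[::-1][:k]]    # highest-confidence members
        decisions[top] = 1
    return decisions

rng = np.random.default_rng(4)
scores = rng.uniform(size=12)
groups = np.array(["A"] * 6 + ["B"] * 6)
decisions = batch_postprocess(scores, groups, acceptance_rates={"A": 0.5, "B": 0.5})
print(decisions, decisions[groups == "A"].mean(), decisions[groups == "B"].mean())
```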