FERMI: Fair Empirical Risk Minimization via Exponential Rényi Mutual Information
Despite the success of large-scale empirical risk minimization (ERM) at
achieving high accuracy across a variety of machine learning tasks, fair ERM is
hindered by the incompatibility of fairness constraints with stochastic
optimization. In this paper, we propose the fair empirical risk minimization
via exponential Rényi mutual information (FERMI) framework. FERMI is built on
a stochastic estimator for exponential Rényi mutual information (ERMI), an
information divergence measuring the degree of dependence of predictions on
sensitive attributes. Theoretically, we show that ERMI upper bounds existing
popular fairness violation metrics, so controlling ERMI also yields guarantees
on these commonly used notions of violation. We derive an unbiased estimator
for ERMI, from which we obtain the FERMI algorithm. We prove that
FERMI converges for demographic parity, equalized odds, and equal opportunity
notions of fairness in stochastic optimization. Empirically, we show that FERMI
is amenable to large-scale problems with multiple (non-binary) sensitive
attributes and non-binary targets. Extensive experiments show that FERMI
achieves the most favorable tradeoffs between fairness violation and test
accuracy across all tested setups compared with state-of-the-art baselines for
demographic parity, equalized odds, and equal opportunity. These benefits are
especially significant for non-binary classification with large sensitive sets
and small batch sizes, showcasing the effectiveness of the FERMI objective and
the developed stochastic algorithm for solving it.
Comment: 29 pages
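For intuition, the order-2 version of ERMI coincides with the chi-squared divergence between the joint distribution of (prediction, sensitive attribute) and the product of its marginals; it is zero exactly when predictions are independent of the sensitive attribute. Below is a minimal plug-in sketch of this quantity for discrete predictions; the function name and the naive plug-in estimation (rather than the paper's unbiased stochastic estimator) are illustrative assumptions, not the paper's algorithm.

import numpy as np

def ermi_plugin(y_pred, s):
    # Plug-in estimate of order-2 exponential Renyi mutual information:
    # the chi^2-divergence between the joint distribution of
    # (prediction, sensitive attribute) and the product of its marginals.
    # Equals 0 iff predictions are independent of the sensitive attribute.
    _, y_idx = np.unique(y_pred, return_inverse=True)
    _, s_idx = np.unique(s, return_inverse=True)
    joint = np.zeros((y_idx.max() + 1, s_idx.max() + 1))
    np.add.at(joint, (y_idx, s_idx), 1.0)
    joint /= joint.sum()
    p_y = joint.sum(axis=1, keepdims=True)  # marginal of predictions
    p_s = joint.sum(axis=0, keepdims=True)  # marginal of sensitive attribute
    mask = joint > 0
    return float((joint[mask] ** 2 / (p_y * p_s)[mask]).sum() - 1.0)

# Sanity checks: perfectly dependent vs. independent predictions.
print(ermi_plugin(np.array([0, 0, 1, 1]), np.array([0, 0, 1, 1])))  # 1.0
print(ermi_plugin(np.array([0, 1, 0, 1]), np.array([0, 0, 1, 1])))  # 0.0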
Rényi Fair Inference
Machine learning algorithms have been increasingly deployed in critical
automated decision-making systems that directly affect human lives. When these
algorithms are only trained to minimize the training/test error, they could
suffer from systematic discrimination against individuals based on their
sensitive attributes such as gender or race. Recently, there has been a surge
of work in the machine learning community on algorithms for fair learning. In
particular, many adversarial learning procedures have been proposed to impose
fairness. Unfortunately, these algorithms either can only impose fairness up to
first-order dependence between the variables, or they lack computational
convergence guarantees. In this paper, we use Rényi correlation as a measure
of fairness of machine learning models and develop a general training framework
to impose fairness. In particular, we propose a min-max formulation which
balances accuracy and fairness when solved to optimality. For the case of
discrete sensitive attributes, we suggest an iterative algorithm with
a theoretical convergence guarantee for solving the proposed min-max problem.
Our algorithm and analysis are then specialized to fair classification and
fair clustering under the disparate impact doctrine. Finally, the performance of
the proposed Rényi fair inference framework is evaluated on the Adult and Bank
datasets.
Comment: 11 pages, 1 figure
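As a reference point for Rényi correlation: for discrete variables, the Hirschfeld-Gebelein-Rényi maximal correlation admits a standard closed form as the second-largest singular value of the matrix Q[i, j] = p(i, j) / sqrt(p_x(i) * p_y(j)). A minimal sketch under that assumption follows; the helper name is hypothetical, and the paper's min-max training procedure is not shown.

import numpy as np

def hgr_maximal_correlation(x, y):
    # Hirschfeld-Gebelein-Renyi maximal correlation of two discrete
    # variables: the second-largest singular value of
    # Q[i, j] = p(i, j) / sqrt(p_x(i) * p_y(j)).
    # 0 iff x and y are independent; 1 under deterministic dependence.
    _, x_idx = np.unique(x, return_inverse=True)
    _, y_idx = np.unique(y, return_inverse=True)
    joint = np.zeros((x_idx.max() + 1, y_idx.max() + 1))
    np.add.at(joint, (x_idx, y_idx), 1.0)
    joint /= joint.sum()
    q = joint / np.sqrt(np.outer(joint.sum(axis=1), joint.sum(axis=0)))
    sv = np.linalg.svd(q, compute_uv=False)  # top singular value is always 1
    return float(sv[1]) if sv.size > 1 else 0.0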
Stochastic Differentially Private and Fair Learning
Machine learning models are increasingly used in high-stakes decision-making
systems. In such applications, a major concern is that these models sometimes
discriminate against certain demographic groups, such as individuals of a
certain race, gender, or age. Another major concern in these applications is
the violation of the privacy of users. While fair learning algorithms have been
developed to mitigate discrimination issues, these algorithms can still leak
sensitive information, such as individuals' health or financial records.
Utilizing the notion of differential privacy (DP), prior works have aimed to
develop learning algorithms that are both private and fair. However,
existing algorithms for DP fair learning are either not guaranteed to converge
or require a full batch of data in each iteration of the algorithm to converge.
In this paper, we provide the first stochastic differentially private algorithm
for fair learning that is guaranteed to converge. Here, the term "stochastic"
refers to the fact that our proposed algorithm converges even when minibatches
of data are used at each iteration (i.e. stochastic optimization). Our
framework is flexible enough to permit different fairness notions, including
demographic parity and equalized odds. In addition, our algorithm can be
applied to non-binary classification tasks with multiple (non-binary) sensitive
attributes. As a byproduct of our convergence analysis, we provide the first
utility guarantee for a DP algorithm for solving nonconvex-strongly concave
min-max problems. Our numerical experiments show that the proposed algorithm
consistently offers significant performance gains over the state-of-the-art
baselines, and can be applied to larger scale problems with non-binary
target/sensitive attributes.
Comment: ICLR 2023
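To make "stochastic DP" concrete, here is a hedged sketch of one noisy minibatch descent-ascent step in the style of DP-SGD: per-example gradients of the model player are clipped and Gaussian noise calibrated to the clipping norm is added before the descent step, while the fairness dual player takes an ordinary ascent step. All names and the exact noising scheme are illustrative assumptions; the paper's algorithm and its privacy accounting differ in the details.

import numpy as np

def dp_noisy_descent_ascent_step(theta, w, grads_theta, grad_w,
                                 lr_theta, lr_w, clip_norm, noise_mult, rng):
    # One illustrative DP-SGD-style step on a min-max objective:
    # clip each per-example gradient of the model player theta, add
    # Gaussian noise with std noise_mult * clip_norm, average, descend;
    # the fairness dual player w takes a plain (non-private) ascent step.
    norms = np.linalg.norm(grads_theta, axis=1, keepdims=True)
    clipped = grads_theta / np.maximum(1.0, norms / clip_norm)
    noise = rng.normal(0.0, noise_mult * clip_norm, size=theta.shape)
    theta = theta - lr_theta * (clipped.sum(axis=0) + noise) / len(grads_theta)
    w = w + lr_w * grad_w
    return theta, w

Only the descent player is privatized in this sketch because the dual variables depend on the data only through the already-noised model updates in such schemes; whether that holds depends on the specific formulation.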
Recent Advances of Differential Privacy in Centralized Deep Learning: A Systematic Survey
Differential Privacy has become a widely popular method for data protection
in machine learning, especially since it allows formulating strict mathematical
privacy guarantees. This survey provides an overview of the state-of-the-art of
differentially private centralized deep learning, thorough analyses of recent
advances and open problems, as well as a discussion of potential future
developments in the field. Based on a systematic literature review, the
following topics are addressed: auditing and evaluation methods for private
models, improvements of privacy-utility trade-offs, protection against a broad
range of threats and attacks, differentially private generative models, and
emerging application domains.
Comment: 35 pages, 2 figures
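For context on the "strict mathematical privacy guarantees" the survey refers to: the textbook Gaussian mechanism releases a statistic plus calibrated Gaussian noise and satisfies (epsilon, delta)-differential privacy. A minimal sketch of this standard result (valid for epsilon < 1):

import numpy as np

def gaussian_mechanism(value, sensitivity, epsilon, delta, rng):
    # Classic Gaussian mechanism: adding N(0, sigma^2) noise with
    # sigma = sensitivity * sqrt(2 * ln(1.25 / delta)) / epsilon
    # yields (epsilon, delta)-differential privacy (for epsilon < 1).
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return value + rng.normal(0.0, sigma, size=np.shape(value))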
Retiring ΔDP: New Distribution-Level Metrics for Demographic Parity
Demographic parity is the most widely recognized measure of group fairness in
machine learning, which ensures equal treatment of different demographic
groups. Numerous works aim to achieve demographic parity by pursuing the
commonly used metric ΔDP. Unfortunately, in this paper, we reveal that the
fairness metric ΔDP cannot precisely measure the violation of demographic
parity, because it inherently has the following drawbacks: i) zero-value ΔDP
does not guarantee zero violation of demographic parity, and ii) ΔDP values
can vary with different classification thresholds. To
this end, we propose two new fairness metrics, Area Between Probability density
function Curves (ABPC) and Area Between Cumulative density function Curves
(ABCC), to precisely measure the violation of demographic parity at the
distribution level. The new fairness metrics directly measure the difference
between the distributions of the prediction probability for different
demographic groups. Thus our proposed new metrics enjoy: i) zero-value
ABCC/ABPC guarantees zero violation of demographic parity; ii) ABCC/ABPC
still guarantees demographic parity as classification thresholds are adjusted.
We further re-evaluate the existing fair models with our proposed fairness
metrics and observe different fairness behaviors of those models under the new
metrics. The code is available at
https://github.com/ahxt/new_metric_for_demographic_parity
Comment: Accepted by TMLR. Code available at
https://github.com/ahxt/new_metric_for_demographic_parity
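Both metrics are straightforward to compute from group-wise prediction probabilities: ABPC integrates the absolute difference of the groups' density curves over [0, 1], and ABCC does the same for the cumulative curves. A minimal sketch, where the Gaussian KDE for ABPC, the empirical CDF for ABCC, and the grid size are assumptions (see the linked repository for the authors' implementation):

import numpy as np
from scipy.stats import gaussian_kde

def abpc(probs_g0, probs_g1, grid_size=1000):
    # Area Between Probability density function Curves: integral over
    # [0, 1] of |pdf_g0 - pdf_g1|, with group-wise densities of the
    # prediction probabilities estimated by a Gaussian KDE.
    grid = np.linspace(0.0, 1.0, grid_size)
    pdf0 = gaussian_kde(probs_g0)(grid)
    pdf1 = gaussian_kde(probs_g1)(grid)
    return float(np.trapz(np.abs(pdf0 - pdf1), grid))

def abcc(probs_g0, probs_g1, grid_size=1000):
    # Area Between Cumulative density function Curves: integral over
    # [0, 1] of |cdf_g0 - cdf_g1|, using empirical CDFs of the
    # prediction probabilities, so no bandwidth choice is needed.
    grid = np.linspace(0.0, 1.0, grid_size)
    cdf0 = np.searchsorted(np.sort(probs_g0), grid, side="right") / len(probs_g0)
    cdf1 = np.searchsorted(np.sort(probs_g1), grid, side="right") / len(probs_g1)
    return float(np.trapz(np.abs(cdf0 - cdf1), grid))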