The Role of Randomness and Noise in Strategic Classification
We investigate the problem of designing optimal classifiers in the strategic
classification setting, where the classification is part of a game in which
players can modify their features to attain a favorable classification outcome
(while incurring some cost). Previously, the problem has been considered from a
learning-theoretic perspective and from the algorithmic fairness perspective.
Our main contributions include 1. Showing that if the objective is to maximize
the efficiency of the classification process (defined as the accuracy of the
outcome minus the sunk cost of the qualified players manipulating their
features to gain a better outcome), then using randomized classifiers (that is, ones where the probability that a given feature vector is accepted by the classifier is strictly between 0 and 1) is necessary. 2. Showing that in many
natural cases, the imposed optimal solution (in terms of efficiency) has a structure in which players never change their feature vectors (the randomized classifier is structured so that the gain in the probability of being classified as a 1 does not justify the cost of changing one's features). 3. Observing that randomized classification is not a stable best response from the classifier's viewpoint, and that the classifier cannot benefit from randomization without creating instability in the system.
4. Showing that in some cases, a noisier signal leads to better equilibrium outcomes, improving both accuracy and fairness when multiple subpopulations with different feature-adjustment costs are involved. This is
interesting from a policy perspective, since it is hard to force institutions to stick to a particular randomized classification strategy (especially in the context of a market with multiple classifiers), but it is possible to alter the information environment to make the feature signals inherently noisier.
Comment: 22 pages. Appeared in FORC, 202
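A toy sketch of the manipulation-deterrence idea in contribution 2: when the acceptance probability rises more slowly than the manipulation cost, a player's best response is to leave their feature unchanged. The one-dimensional feature space, the linear cost, and the best_response helper below are all illustrative assumptions, not the paper's model.

def best_response(x, accept_prob, unit_cost, grid):
    # Player at feature x picks the x' maximizing P(accept) minus moving cost.
    return max(grid, key=lambda xp: accept_prob(xp) - unit_cost * abs(xp - x))

def deterministic(x):            # hard threshold at 0.5
    return 1.0 if x >= 0.5 else 0.0

def randomized(x):               # graded acceptance rule
    # Slope (1.5 per unit) below the cost (2.0 per unit): the marginal gain
    # in acceptance never justifies the marginal cost of moving.
    return min(1.0, max(0.0, 1.5 * (x - 0.2)))

grid = [i / 100 for i in range(101)]    # feature space [0, 1]
unit_cost = 2.0                         # cost per unit of feature change

for x0 in (0.30, 0.45, 0.60):
    moved_det = best_response(x0, deterministic, unit_cost, grid)
    moved_rnd = best_response(x0, randomized, unit_cost, grid)
    print(f"x={x0:.2f}: threshold -> {moved_det:.2f}, randomized -> {moved_rnd:.2f}")

Players below the deterministic threshold move onto it, while under the graded rule every player stays put, matching the structure described in contribution 2.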
Fair and Optimal Classification via Transports to Wasserstein-Barycenter
Fairness in automated decision-making systems has gained increasing attention
as their applications expand to real-world high-stakes domains. To facilitate
the design of fair ML systems, it is essential to understand the potential
trade-offs between fairness and predictive power, and the construction of the
optimal predictor under a given fairness constraint. In this paper, for general
classification problems under the group fairness criterion of demographic
parity (DP), we precisely characterize the trade-off between DP and
classification accuracy, referred to as the minimum cost of fairness. Our
insight comes from the key observation that finding the optimal fair classifier
is equivalent to solving a Wasserstein-barycenter problem under the ℓ1-norm
restricted to the vertices of the probability simplex. Inspired by our
characterization, we provide a construction of an optimal fair classifier
achieving this minimum cost via the composition of the Bayes regressor and
optimal transports from its output distributions to the barycenter. Our
construction naturally leads to an algorithm for post-processing any
pre-trained predictor to satisfy DP fairness, complemented with finite sample
guarantees. Experiments on real-world datasets demonstrate the effectiveness of our approaches.
Comment: Code is at https://github.com/rxian/fair-classificatio
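The construction can be pictured with a simplified sketch for binary classification: map each group's predicted scores onto a shared barycenter distribution via monotone (on the real line, optimal) transports, then threshold. The beta-distributed scores and the quantile-averaging barycenter below are illustrative stand-ins, not the authors' exact algorithm or guarantees.

import numpy as np

rng = np.random.default_rng(0)
scores_a = rng.beta(2, 5, 5000)   # Bayes-regressor outputs, group A
scores_b = rng.beta(5, 2, 5000)   # group B (shifted distribution)

# Barycenter on the line: average the two quantile functions
# (a valid 1-Wasserstein barycenter of two distributions).
qs = np.linspace(0, 1, 101)
barycenter_q = 0.5 * (np.quantile(scores_a, qs) + np.quantile(scores_b, qs))

def transport(scores, barycenter_q, qs):
    # Monotone (optimal on the line) map from each score to the barycenter.
    ranks = np.searchsorted(np.sort(scores), scores) / len(scores)
    return np.interp(ranks, qs, barycenter_q)

adj_a = transport(scores_a, barycenter_q, qs)
adj_b = transport(scores_b, barycenter_q, qs)
t = 0.5   # any threshold on the common distribution gives equal acceptance rates
print("acceptance rates:", (adj_a > t).mean(), (adj_b > t).mean())

Because both groups are mapped to the same distribution, any threshold satisfies demographic parity by construction; the paper's contribution is showing how to do this while losing the least accuracy.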
Certifying the Fairness of KNN in the Presence of Dataset Bias
We propose a method for certifying the fairness of the classification result
of a widely used supervised learning algorithm, the k-nearest neighbors (KNN),
under the assumption that the training data may have historical bias caused by
systematic mislabeling of samples from a protected minority group. To the best
of our knowledge, this is the first certification method for KNN based on three
variants of the fairness definition: individual fairness, ε-fairness,
and label-flipping fairness. We first define the fairness certification problem
for KNN and then propose sound approximations of the complex arithmetic
computations used in the state-of-the-art KNN algorithm. This is meant to lift
the computation results from the concrete domain to an abstract domain, to
reduce the computational cost. We show the effectiveness of this abstract-interpretation-based technique through an experimental evaluation on six datasets
widely used in the fairness research literature. We also show that the method
is accurate enough to obtain fairness certifications for a large number of test
inputs, despite the presence of historical bias in the datasets.
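On a small scale the certification problem can be stated exactly by enumeration, which clarifies what the abstract-interpretation machinery must over-approximate. The brute-force label-flipping check for a 1-D KNN below is purely illustrative; the paper's method replaces this exponential enumeration with sound abstractions to scale.

import itertools

def knn_predict(train, labels, x, k=3):
    idx = sorted(range(len(train)), key=lambda i: abs(train[i] - x))[:k]
    return int(sum(labels[i] for i in idx) * 2 > k)   # majority vote

def certify(train, labels, protected, x, k=3, max_flips=1):
    # Certified iff no flipping of up to max_flips minority-group labels
    # (modeling historical mislabeling bias) changes the prediction for x.
    base = knn_predict(train, labels, x, k)
    candidates = [i for i in range(len(train)) if protected[i]]
    for r in range(1, max_flips + 1):
        for flips in itertools.combinations(candidates, r):
            flipped = [1 - l if i in flips else l for i, l in enumerate(labels)]
            if knn_predict(train, flipped, x, k) != base:
                return False   # a biased labeling could change the outcome
    return True

train = [0.1, 0.2, 0.35, 0.6, 0.8, 0.9]
labels = [0, 0, 1, 1, 1, 1]
protected = [1, 0, 1, 0, 0, 1]
print(certify(train, labels, protected, x=0.7, k=3, max_flips=1))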
CAFIN: Centrality Aware Fairness inducing IN-processing for Unsupervised Representation Learning on Graphs
Unsupervised representation learning on (large) graphs has received
significant attention in the research community due to the compactness and
richness of the learned embeddings and the abundance of unlabelled graph data.
When deployed, these node representations must be generated with appropriate
fairness constraints to minimize bias induced by them on downstream tasks.
Consequently, group and individual fairness notions for graph learning
algorithms have been investigated for specific downstream tasks. One major
limitation of these fairness notions is that they do not consider the connectivity patterns in the graph that lead to varied node influence (or
centrality power). In this paper, we design a centrality-aware fairness
framework for inductive graph representation learning algorithms. We propose
CAFIN (Centrality Aware Fairness inducing IN-processing), an in-processing
technique that leverages graph structure to improve the representations learned by GraphSAGE, a popular framework in the unsupervised inductive setting. We demonstrate the efficacy of CAFIN on two popular downstream tasks, link prediction and node classification. Empirically, CAFIN consistently reduces the disparity in fairness between groups across datasets from different domains (an 18% to 80% reduction in imparity, a measure of group fairness) while incurring only a minimal performance cost.
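As a rough sketch of what an in-processing, centrality-aware penalty can look like, the snippet below adds the squared gap in mean per-node loss between high- and low-degree nodes to an unsupervised objective. The degree-based grouping, the centrality_fairness_penalty helper, and the weighting are assumptions for illustration; CAFIN's actual loss is defined differently.

import torch

def centrality_fairness_penalty(per_node_loss, degrees, threshold):
    # Penalize the gap in mean loss between high- and low-centrality nodes.
    high = degrees >= threshold
    gap = per_node_loss[high].mean() - per_node_loss[~high].mean()
    return gap.pow(2)

# Example: plug the penalty into a GraphSAGE-style unsupervised objective.
per_node_loss = torch.rand(100, requires_grad=True)   # stand-in for the base loss
degrees = torch.randint(1, 20, (100,))                # degree as a centrality proxy
total = per_node_loss.mean() + 0.5 * centrality_fairness_penalty(
    per_node_loss, degrees, threshold=degrees.float().median())
total.backward()   # gradients flow through both the task and fairness terms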
A Differentiable Distance Approximation for Fairer Image Classification
Naively trained AI models can be heavily biased. This can be particularly
problematic when the biases involve legally or morally protected attributes
such as ethnic background, age or gender. Existing solutions to this problem come at the cost of extra computation or unstable adversarial optimisation, or impose losses on the feature-space structure that are disconnected from fairness measures and generalise to fairness only loosely. In this work we propose a differentiable approximation of the variance of demographics, a metric that can be used to measure the bias, or unfairness, in an AI model. Our approximation can be optimised alongside the regular training objective, which eliminates the need for any extra models during training and directly improves the fairness of
the regularised models. We demonstrate that our approach improves the fairness
of AI models in varied task and dataset scenarios, whilst still maintaining a
high level of classification accuracy. Code is available at
https://bitbucket.org/nelliottrosa/base_fairness
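A minimal PyTorch sketch of the general recipe: compute a differentiable group-disparity term, here the variance across groups of the mean predicted probability, and add it to the task loss so no auxiliary model is needed. The demographic_variance helper and the regularisation weight are illustrative; the paper's exact approximation may differ.

import torch
import torch.nn.functional as F

def demographic_variance(probs, groups):
    # Variance over groups of the mean predicted probability (differentiable).
    means = torch.stack([probs[groups == g].mean() for g in torch.unique(groups)])
    return means.var(unbiased=False)

logits = torch.randn(64, requires_grad=True)    # stand-in for model outputs
targets = torch.randint(0, 2, (64,)).float()
groups = torch.randint(0, 3, (64,))             # three demographic groups

probs = torch.sigmoid(logits)
loss = F.binary_cross_entropy(probs, targets) + 10.0 * demographic_variance(probs, groups)
loss.backward()                                 # no extra model needed during training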
Equal Opportunity in Online Classification with Partial Feedback
We study an online classification problem with partial feedback in which
individuals arrive one at a time from a fixed but unknown distribution, and
must be classified as positive or negative. Our algorithm only observes the
true label of an individual if they are given a positive classification. This
setting captures many classification problems for which fairness is a concern:
for example, in criminal recidivism prediction, recidivism is only observed if
the inmate is released; in lending applications, loan repayment is only
observed if the loan is granted. We require that our algorithms satisfy common
statistical fairness constraints (such as equalizing false positive or negative
rates -- introduced as "equal opportunity" in Hardt et al. (2016)) at every
round, with respect to the underlying distribution. We give upper and lower
bounds characterizing the cost of this constraint in terms of the regret rate
(and show that it is mild), and give an oracle-efficient algorithm that achieves the upper bound.
Comment: The conference version of this paper appears in the Proceedings of NeurIPS 2019. 29 pages.
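The one-sided feedback structure is easy to see in simulation: because labels are revealed only for accepted individuals, naive statistics over the observed data are biased. The score model and parameters below are made up for illustration and are unrelated to the paper's algorithm.

import random

random.seed(0)
threshold = 0.5
n_pos_total = n_pos_accepted = n_accepted = 0

for _ in range(100_000):
    label = random.random() < 0.3                   # 30% truly positive
    score = 0.4 + 0.3 * label + random.gauss(0, 0.15)
    n_pos_total += label
    if score >= threshold:                          # positive classification...
        n_accepted += 1
        n_pos_accepted += label                     # ...is the only time the label is seen

print("population base rate:", n_pos_total / 100_000)
print("rate among accepted: ", n_pos_accepted / n_accepted)   # noticeably higher

The positive rate among accepted individuals far exceeds the base rate, which is why enforcing constraints like equal opportunity at every round, with respect to the underlying distribution, is nontrivial in this setting.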
Rule Generation for Classification: Scalability, Interpretability, and Fairness
We introduce a new rule-based optimization method for classification with
constraints. The proposed method leverages column generation for linear
programming, and hence, is scalable to large datasets. The resulting pricing
subproblem is shown to be NP-hard. We resort to a decision tree-based heuristic and solve a proxy pricing subproblem for acceleration. The method
returns a set of rules along with their optimal weights indicating the
importance of each rule for learning. We address interpretability and fairness
by assigning cost coefficients to the rules and introducing additional
constraints. In particular, we focus on local interpretability and generalize
the separation criterion in fairness to multiple sensitive attributes and classes.
We test the performance of the proposed methodology on a collection of datasets
and present a case study to elaborate on its different aspects. The proposed
rule-based learning method exhibits a good compromise between local interpretability and fairness on the one hand, and accuracy on the other.
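For intuition, the restricted master problem over a fixed set of candidate rules is a small LP: choose nonnegative rule weights minimizing rule costs plus slack penalties, subject to each training sample being covered with total weight at least one. The toy coverage matrix and costs below are illustrative; in the actual method, column generation prices new rules via the NP-hard subproblem rather than fixing them up front.

import numpy as np
from scipy.optimize import linprog

# A[i, j] = 1 if rule j covers sample i and votes its true class, else 0.
A = np.array([[1, 0, 1],
              [0, 1, 1],
              [1, 1, 0],
              [0, 0, 1]])
rule_cost = np.array([1.0, 1.0, 2.0])   # e.g. longer rules cost more
C = 5.0                                  # penalty per unit of slack

n, m = A.shape
# Variables: [w_1..w_m, xi_1..xi_n]; minimize rule_cost @ w + C * sum(xi).
c = np.concatenate([rule_cost, C * np.ones(n)])
# Coverage constraint A @ w + xi >= 1, rewritten as -(A @ w) - xi <= -1.
A_ub = np.hstack([-A, -np.eye(n)])
b_ub = -np.ones(n)
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * (m + n))
print("rule weights:", res.x[:m].round(3), "slacks:", res.x[m:].round(3))

The optimal weights indicate each rule's importance; interpretability and fairness enter by adjusting the cost coefficients and adding constraints to this LP.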
Bias mitigation with AIF360: A comparative study
The use of artificial intelligence for decision making raises concerns about the societal impact of such systems. Traditionally, the decisions of a human decision-maker are governed by laws and human values. Decision-making is now being guided, or in some cases replaced, by machine learning classification, which may reinforce and introduce bias. Algorithmic bias mitigation is explored as an approach to avoid this; however, it comes at a cost to efficiency and accuracy. We conduct an empirical analysis of two off-the-shelf bias mitigation techniques from the AIF360 toolkit on a binary classification task. Our preliminary results indicate that bias mitigation is a feasible approach to ensuring group fairness.
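A sketch of what such a comparison looks like with AIF360, using one pre-processing technique (Reweighing) and the statistical parity difference as the group-fairness metric. The abstract does not name which two techniques the study compares, so this choice is an assumption, and AdultDataset presumes the raw Adult data files have been placed where AIF360's documentation indicates.

# Assumes the raw Adult dataset files are installed per AIF360's instructions.
from aif360.datasets import AdultDataset
from aif360.algorithms.preprocessing import Reweighing
from aif360.metrics import BinaryLabelDatasetMetric

privileged, unprivileged = [{"sex": 1}], [{"sex": 0}]
data = AdultDataset()

before = BinaryLabelDatasetMetric(data, unprivileged, privileged)
rw = Reweighing(unprivileged_groups=unprivileged, privileged_groups=privileged)
data_rw = rw.fit_transform(data)   # reweights instances to balance outcomes
after = BinaryLabelDatasetMetric(data_rw, unprivileged, privileged)

# A statistical parity difference of 0 means equal favorable-outcome rates.
print("before:", before.statistical_parity_difference())
print("after: ", after.statistical_parity_difference())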