Achieving non-discrimination in prediction
Discrimination-aware classification is receiving increasing attention in the
data science community. Pre-processing methods for constructing a
discrimination-free classifier first remove discrimination from the training
data and then learn the classifier from the cleaned data. However, they lack a
theoretical guarantee on the discrimination that may arise when the classifier is
deployed for prediction. In this paper, we fill this gap by mathematically
bounding the probability that the discrimination in prediction lies within a
given interval, in terms of the training data and the classifier. We adopt a
causal model to describe the data generation mechanism and to formally define
discrimination in the population, in a dataset, and in prediction. We obtain two
important theoretical results: (1) discrimination in prediction can still
exist even if discrimination in the training data is completely removed;
and (2) not all pre-processing methods can ensure non-discrimination in prediction
even though they achieve non-discrimination in the modified training data.
Based on the results, we develop a two-phase framework for constructing a
discrimination-free classifier with a theoretical guarantee. The experiments
demonstrate the theoretical results and show the effectiveness of our two-phase
framework.
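To make result (1) concrete: even when a pre-processed training set shows essentially no discrimination under a simple risk-difference measure, a classifier fit to that data can still produce discriminatory predictions through features that proxy the protected attribute. The sketch below is a minimal illustration on a toy data-generating process; the variable names and the risk-difference measure are stand-ins for illustration, not the paper's causal definitions or its bound.

```python
# Minimal sketch (toy data, hypothetical names): discrimination measured on the
# training labels vs. on a classifier's predictions. Illustrates only that the two
# can differ; it does not reproduce the paper's causal definitions or its bound.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 20_000
s = rng.integers(0, 2, n)                         # protected attribute (0/1)
x = rng.normal(0, 1, n) + 0.8 * s                 # feature that proxies s
y = (x - 0.8 * s + rng.normal(0, 1, n) > 0).astype(int)  # labels carry ~no risk difference

def risk_difference(outcome, group):
    """P(outcome = 1 | group = 1) - P(outcome = 1 | group = 0)."""
    return outcome[group == 1].mean() - outcome[group == 0].mean()

clf = LogisticRegression().fit(x.reshape(-1, 1), y)       # classifier only sees the proxy
pred = clf.predict(x.reshape(-1, 1))

print(f"risk difference in training labels: {risk_difference(y, s):+.3f}")    # ~0
print(f"risk difference in predictions:     {risk_difference(pred, s):+.3f}")  # clearly positive
```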
Fairness in Algorithmic Decision Making: An Excursion Through the Lens of Causality
As virtually all aspects of our lives are increasingly impacted by
algorithmic decision making systems, it is incumbent upon us as a society to
ensure such systems do not become instruments of unfair discrimination on the
basis of gender, race, ethnicity, religion, etc. We consider the problem of
determining whether the decisions made by such systems are discriminatory,
through the lens of causal models. We introduce two definitions of group
fairness grounded in causality: fair on average causal effect (FACE), and fair
on average causal effect on the treated (FACT). We use the Rubin-Neyman
potential outcomes framework for the analysis of cause-effect relationships to
robustly estimate FACE and FACT. We demonstrate the effectiveness of our
proposed approach on synthetic data. Our analyses of two real-world data sets,
the Adult income data set from the UCI repository (with gender as the protected
attribute), and the NYC Stop and Frisk data set (with race as the protected
attribute), show that the evidence of discrimination obtained by FACE and FACT,
or lack thereof, is often in agreement with the findings from other studies. We
further show that FACT, being somewhat more nuanced than FACE, can yield
findings of discrimination that differ from those obtained using FACE.
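As a rough illustration of the FACE- and FACT-style quantities, the sketch below estimates the average effect of a protected attribute on an outcome, overall and among the exposed group, by regression adjustment over observed covariates on simulated data. The estimator, variable names, and data are assumptions made for illustration; the paper works in the Rubin-Neyman potential-outcomes framework and its estimators may differ.

```python
# Minimal sketch (simulated data, hypothetical names): FACE-style and FACT-style
# quantities, i.e., the average causal effect of protected attribute a on outcome y,
# overall and within the a = 1 group, estimated by simple regression adjustment.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
n = 10_000
X = rng.normal(size=(n, 2))                          # observed covariates
a = rng.integers(0, 2, n)                            # protected attribute (e.g., gender)
y = X @ np.array([1.0, -0.5]) + 0.3 * a + rng.normal(0, 1, n)  # true effect of a is 0.3

# Fit an outcome model per group, then average the predicted potential outcomes
m1 = LinearRegression().fit(X[a == 1], y[a == 1])
m0 = LinearRegression().fit(X[a == 0], y[a == 0])
face_est = (m1.predict(X) - m0.predict(X)).mean()                  # E[Y(1) - Y(0)]
fact_est = (m1.predict(X[a == 1]) - m0.predict(X[a == 1])).mean()  # effect "on the treated"

print(f"FACE-style estimate: {face_est:.3f}")   # near 0.3 here, i.e., evidence of disparity
print(f"FACT-style estimate: {fact_est:.3f}")
```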
Autonomous Vehicles for All?
The traditional build-and-expand approach is not a viable solution to keep
roadway traffic rolling safely, so technological solutions, such as Autonomous
Vehicles (AVs), are favored. AVs have considerable potential to increase the
carrying capacity of roads, ameliorate the chore of driving, improve safety,
provide mobility for those who cannot drive, and help the environment. However,
they also raise concerns over whether they are socially responsible with
respect to issues such as fairness, equity, and transparency. Regulatory bodies have
focused on AV safety, cybersecurity, privacy, and legal liability issues, but
have failed to adequately address social responsibility. Thus, existing AV
developers do not have to embed social responsibility factors in their
proprietary technology. Adverse bias may therefore occur in the development and
deployment of AV technology. For instance, an artificial intelligence-based
pedestrian detection application used in an AV may, in limited lighting
conditions, be biased toward detecting pedestrians from one racial demographic
more reliably than pedestrians from other racial demographics. Also, AV
technologies tend to be costly, with hardware and software setups that may be
beyond the reach of lower-income people. In
addition, data generated by AVs about their users may be misused by third
parties such as corporations, criminals, or even foreign governments. AVs
promise to dramatically impact labor markets, as many jobs that involve driving
will be made redundant. We argue that academic institutions, industry, and
government agencies overseeing AV development and deployment must act
proactively to ensure that AVs serve all and do not increase the digital divide
in our society.
Fair Inference On Outcomes
In this paper, we consider the problem of fair statistical inference
involving outcome variables. Examples include classification and regression
problems, and estimating treatment effects in randomized trials or
observational data. The issue of fairness arises in such problems where some
covariates or treatments are "sensitive," in the sense of having the potential
to create discrimination. In this paper, we argue that the presence of
discrimination can be formalized in a sensible way as the presence of an effect
of a sensitive covariate on the outcome along certain causal pathways, a view
which generalizes (Pearl, 2009). A fair outcome model can then be learned by
solving a constrained optimization problem. We discuss a number of
complications that arise in classical statistical inference due to this view
and provide workarounds based on recent work in causal and semi-parametric
inference.
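A much-simplified version of the constrained-optimization recipe can be sketched as follows: fit a logistic outcome model while bounding a fairness functional of its predictions. In the sketch below, the constrained quantity is merely the mean difference in predicted scores across the sensitive groups, a crude stand-in for the path-specific causal effects the paper constrains with semi-parametric estimators; the data and names are hypothetical.

```python
# Minimal sketch (toy data, hypothetical names): learning an outcome model under a
# fairness constraint. The constraint here bounds the gap in mean predicted scores
# between sensitive groups, NOT the path-specific effects used in the paper.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
n, eps = 5_000, 0.02
s = rng.integers(0, 2, n).astype(float)          # sensitive covariate
x = rng.normal(size=n) + s                       # ordinary covariate, correlated with s
X = np.column_stack([np.ones(n), x, s])          # intercept, x, s
y = (0.5 * x + 0.8 * s + rng.normal(0, 1, n) > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def nll(beta):                                   # logistic negative log-likelihood
    p = sigmoid(X @ beta)
    return -np.mean(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))

def score_gap(beta):                             # stand-in fairness functional
    p = sigmoid(X @ beta)
    return p[s == 1].mean() - p[s == 0].mean()

cons = [{"type": "ineq", "fun": lambda b: eps - score_gap(b)},   # gap <= eps
        {"type": "ineq", "fun": lambda b: eps + score_gap(b)}]   # gap >= -eps
res = minimize(nll, x0=np.zeros(3), constraints=cons, method="SLSQP")

print("coefficients:", np.round(res.x, 3))
print(f"constrained score gap: {score_gap(res.x):+.3f}")   # |gap| held near eps
```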
Fair CRISP-DM: Embedding Fairness in Machine Learning (ML) Development Life Cycle
With the rapid adoption of machine learning (ML) technologies, organizations are constantly seeking efficient processes for developing them. The cross-industry standard process for data mining (CRISP-DM) provides an industry- and technology-independent model for organizing ML projects’ development. However, the model does not address fairness concerns related to ML technologies. To address this important theoretical and practical gap in the literature, we propose a new model, Fair CRISP-DM, which categorizes and presents the relevant fairness challenges in each phase of project development. We contribute to the literature on ML development and fairness. Specifically, ML researchers and practitioners can adopt our model to check for and mitigate fairness concerns in each phase of ML project development.