40 research outputs found

Distill-and-Compare: Auditing Black-Box Models Using Transparent Model Distillation

Author: Adebayo Julius
Chouldechova Alexandra
Datta A.
Hastie Trevor J
Hinton Geoffrey
Kim Michael P
Tramer Florian
Wang Hao
Zhang Zhe
Publication venue: Association for Computing Machinery (ACM)
Publication date: 11/10/2018

Black-box risk scoring models permeate our lives, yet are typically proprietary or opaque. We propose Distill-and-Compare, a model distillation and comparison approach to audit such models. To gain insight into black-box models, we treat them as teachers, training transparent student models to mimic the risk scores assigned by black-box models. We compare the student model trained with distillation to a second un-distilled transparent model trained on ground-truth outcomes, and use differences between the two models to gain insight into the black-box model. Our approach can be applied in a realistic setting, without probing the black-box model API. We demonstrate the approach on four public data sets: COMPAS, Stop-and-Frisk, Chicago Police, and Lending Club. We also propose a statistical test to determine if a data set is missing key features used to train the black-box model. Our test finds that the ProPublica data is likely missing key feature(s) used in COMPAS.

Comment: Camera-ready version for AAAI/ACM AIES 2018. Data and pseudocode at https://github.com/shftan/auditblackbox. Previously titled "Detecting Bias in Black-Box Models Using Transparent Model Distillation". A short version was presented at NIPS 2017 Symposium on Interpretable Machine Learning.

arXiv.org e-Print Archive

Crossref

Case Study: Predictive Fairness to Reduce Misdemeanor Recidivism Through Social Service Interventions

Author: Agarwal Alekh
Chouldechova Alexandra
Demleitner Nora V.
Dwork Cynthia
Huq Aziz Z.
Hyndman Rob J
Kim Pauline T.
Kondo L. L.
Kroll Joshua A.
MacCarthy Mark
Mayson Sandra G
Stone T. Howard
Taslitz Andrew E.
Zafar Muhammad Bilal
Zemel Rich
Publication venue: Association for Computing Machinery (ACM)
Publication date: 24/01/2020

The criminal justice system is currently ill-equipped to improve outcomes of individuals who cycle in and out of the system with a series of misdemeanor offenses. Often due to constraints of caseload and poor record linkage, prior interactions with an individual may not be considered when an individual comes back into the system, let alone in a proactive manner through the application of diversion programs. The Los Angeles City Attorney's Office recently created a new Recidivism Reduction and Drug Diversion unit (R2D2) tasked with reducing recidivism in this population. Here we describe a collaboration with this new unit as a case study for the incorporation of predictive equity into machine learning based decision making in a resource-constrained setting. The program seeks to improve outcomes by developing individually-tailored social service interventions (i.e., diversions, conditional plea agreements, stayed sentencing, or other favorable case disposition based on appropriate social service linkage rather than traditional sentencing methods) for individuals likely to experience subsequent interactions with the criminal justice system, a time- and resource-intensive undertaking that necessitates an ability to focus resources on individuals most likely to be involved in a future case. Seeking to achieve both efficiency (through predictive accuracy) and equity (improving outcomes in traditionally under-served communities and working to mitigate existing disparities in criminal justice outcomes), we discuss the equity outcomes we seek to achieve, describe the corresponding choice of a metric for measuring predictive fairness in this context, and explore a set of options for balancing equity and efficiency when building and selecting machine learning models in an operational public policy setting.

Comment: 12 pages, 4 figures, 1 algorithm. The definitive Version of Record will be published in the proceedings of the Conference on Fairness, Accountability, and Transparency (FAT* '20), January 27-30, 2020, Barcelona, Spain.

arXiv.org e-Print Archive

Crossref

Learning a formula of interpretability to learn interpretable formulas

Author: A Adadi
A Cano
A Chouldechova
A Ekárt
A Meurer
AB Arrieta
AS Sambo
B Goodman
B Tran
BT Zhang
C Rudin
CM Bishop
D Hein
DR White
EJ Vladislavleva
F Pedregosa
G Squillero
GF Smits
H Zhao
H Zou
J Demšar
J McCormack
K Deb
M Keijzer
M Maruyama
M Virgolin
P Wang
Q Chen
R Guidotti
R Poli
S Holm
S Ruberto
S Silva
W Wang
Y Liang
ZC Lipton
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2020

Many risk-sensitive applications require Machine Learning (ML) models to be interpretable. Attempts to obtain interpretable models typically rely on tuning, by trial-and-error, hyper-parameters of model complexity that are only loosely related to interpretability. We show that it is instead possible to take a meta-learning approach: an ML model of non-trivial Proxies of Human Interpretability (PHIs) can be learned from human feedback, then this model can be incorporated within an ML training process to directly optimize for interpretability. We show this for evolutionary symbolic regression. We first design and distribute a survey aimed at finding a link between features of mathematical formulas and two established PHIs, simulatability and decomposability. Next, we use the resulting dataset to learn an ML model of interpretability. Lastly, we query this model to estimate the interpretability of evolving solutions within bi-objective genetic programming. We perform experiments on five synthetic and eight real-world symbolic regression problems, comparing to the traditional use of solution size minimization. The results show that the use of our model leads to formulas that are, for the same level of accuracy-interpretability trade-off, either significantly more or equally accurate. Moreover, the formulas are also arguably more interpretable. Given the very positive results, we believe that our approach represents an important stepping stone for the design of next-generation interpretable (evolutionary) ML algorithms.

Archivio istituzionale della ricerca - Università di Trieste

arXiv.org e-Print Archive

Crossref

CWI's Institutional Repository

Reluctant Generalised Additive Modelling

Author: Chouldechova A.
Efron B.
Sadhanala V.
Yu G.
Publication venue: Wiley

Crossref

Towards Formal Fairness in Machine Learning

Author: A Chouldechova
A Ignatiev
AP Kamath
C Bessiere
D Maliotov
F Pedregosa
G Katz
J Demsar
L de Moura
L Pulina
M Wu
N Eén
R Ehlers
S Verwer
X Huang
Publication venue: Springer
Publication date: 01/01/2020

One of the challenges of deploying machine learning (ML) systems is fairness. Datasets often include sensitive features, which ML algorithms may unwittingly use to create models that exhibit unfairness. Past work on fairness offers no formal guarantees in its results. This paper proposes to exploit formal reasoning methods to tackle fairness. Starting from an intuitive criterion for fairness of an ML model, the paper formalises it, and shows how fairness can be represented as a decision problem, given some logic representation of an ML model. The same criterion can also be applied to assessing bias in training data. Moreover, we propose a reasonable set of axiomatic properties which no other definition of dataset bias can satisfy. The paper also investigates the relationship between fairness and explainability, and shows that approaches for computing explanations can serve to assess fairness of particular predictions. Finally, the paper proposes SAT-based approaches for learning fair ML models, even when the training data exhibits bias, and reports experimental trials.

Crossref

Scientific Publications of the University of Toulouse II Le Mirail

HAL Descartes

HAL-INSA Toulouse

Monash University Research Portal

Technological Workforce and Its Impact on Algorithmic Justice in Politics

Author: A Chouldechova
Alan Hyde
H Moritz
M Bertrand
R Brauneis
RD Godsil
S Hoffman
Publication venue: Springer Science and Business Media LLC

Crossref

FairFace Challenge at ECCV 2020: Analyzing Bias in Face Recognition

Author: A Chouldechova
E Learned-Miller
G Guo
J Pearl
R Rothe
S Bino
S Lo Piano
U Jayaraman
Publication venue: Springer
Publication date: 01/01/2020

This work summarizes the 2020 ChaLearn Looking at People Fair Face Recognition and Analysis Challenge and provides a description of the top-winning solutions and analysis of the results. The aim of the challenge was to evaluate accuracy and bias in gender and skin colour of submitted algorithms on the task of 1:1 face verification in the presence of other confounding attributes. Participants were evaluated using an in-the-wild dataset based on reannotated IJB-C, further enriched by 12.5K new images and additional labels. The dataset is not balanced, which simulates a real-world scenario where AI-based models supposed to present fair outcomes are trained and evaluated on imbalanced data. The challenge attracted 151 participants, who made more than 1.8K submissions in total. The final phase of the challenge attracted 36 active teams, out of which 10 exceeded 0.999 AUC-ROC while achieving very low scores in the proposed bias metrics. Common strategies by the participants were face pre-processing, homogenization of data distributions, the use of bias-aware loss functions and ensemble models. The analysis of the top-10 teams shows higher false positive rates (and lower false negative rates) for females with dark skin tone, as well as the potential of eyeglasses and young age to increase the false positive rates too.

arXiv.org e-Print Archive

Queen's University Belfast Research Portal

Crossref

VBN

FairFace Challenge at ECCV 2020: Analyzing Bias in Face Recognition

Author: A Chouldechova
E Learned-Miller
G Guo
J Pearl
R Rothe
S Bino
S Lo Piano
U Jayaraman
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2020

Crossref

VBN

Online sequential monitoring of spatio-temporal disease incidence rates

Author: Banerjee S.
Brabanter K.D.
Chouldechova A.
Cressie N.
Diggle P.J
Fiore A.E.
Kite-Powell A.
Last J.M
Qiu P
Woodall W.H.
Publication venue: Informa UK Limited

Crossref

Fair regression for health care spending

Author: Berk R.
Chouldechova A.
Dwork C.
Fu A.
Hardt M.
Kautter J.
McGuire T.
Pope G.C.
Zafar M.B.
Zemel R.
Publication venue: Wiley

Crossref