Fair Inference On Outcomes
In this paper, we consider the problem of fair statistical inference
involving outcome variables. Examples include classification and regression
problems, and estimating treatment effects in randomized trials or
observational data. The issue of fairness arises in such problems where some
covariates or treatments are "sensitive," in the sense of having potential of
creating discrimination. In this paper, we argue that the presence of
discrimination can be formalized in a sensible way as the presence of an effect
of a sensitive covariate on the outcome along certain causal pathways, a view
which generalizes (Pearl, 2009). A fair outcome model can then be learned by
solving a constrained optimization problem. We discuss a number of
complications that arise in classical statistical inference due to this view
and provide workarounds based on recent work in causal and semi-parametric
inference.
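To make the constrained-optimization framing concrete, here is a minimal sketch on synthetic data; it is not the paper's semi-parametric estimator, and the bounded quantity is only a crude stand-in for a path-specific effect. It fits a logistic outcome model while constraining the average change in predicted probability when the sensitive attribute is flipped.

```python
# A minimal sketch (assumed toy setup, not the paper's estimator): learn a
# logistic model by constrained optimization, bounding a simple proxy for the
# effect of the sensitive covariate A on the predicted outcome.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n = 500
A = rng.integers(0, 2, n)                  # sensitive attribute (synthetic)
X = rng.normal(size=n) + 0.5 * A           # ordinary covariate correlated with A
y = (rng.random(n) < 1 / (1 + np.exp(-(1.0 * X + 1.5 * A)))).astype(float)

def predict(theta, X, A):
    z = theta[0] + theta[1] * X + theta[2] * A
    return 1 / (1 + np.exp(-z))

def log_loss(theta):
    p = np.clip(predict(theta, X, A), 1e-9, 1 - 1e-9)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def direct_effect(theta):
    # average |P(Y=1 | X, A=1) - P(Y=1 | X, A=0)| over the sample
    return np.mean(np.abs(predict(theta, X, np.ones(n)) - predict(theta, X, np.zeros(n))))

eps = 0.02  # tolerance on the proxy effect of A
res = minimize(log_loss, x0=np.zeros(3),
               constraints=[{"type": "ineq", "fun": lambda t: eps - direct_effect(t)}])
print("fitted coefficients:", res.x, "constrained effect:", direct_effect(res.x))
```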
Is Algorithmic Affirmative Action Legal?
This Article is the first to comprehensively explore whether algorithmic affirmative action is lawful. It concludes that both statutory and constitutional antidiscrimination law leave room for race-aware affirmative action in the design of fair algorithms. Along the way, the Article recommends some clarifications of current doctrine and proposes the pursuit of formally race-neutral methods to achieve the admittedly race-conscious goals of algorithmic affirmative action.
The Article proceeds as follows. Part I introduces algorithmic affirmative action. It begins with a brief review of the bias problem in machine learning and then identifies multiple design options for algorithmic fairness. These designs are presented at a theoretical level, rather than in formal mathematical detail. Part I also highlights some difficult truths that stakeholders, jurists, and legal scholars must understand about the accuracy and fairness trade-offs inherent in fairness solutions. Part II turns to the legality of algorithmic affirmative action, beginning with the statutory challenge under Title VII of the Civil Rights Act. Part II argues that voluntary algorithmic affirmative action ought to survive a disparate treatment challenge under Ricci and under the anti-race-norming provision of Title VII. Finally, Part III considers the constitutional challenge to algorithmic affirmative action by state actors. It concludes that at least some forms of algorithmic affirmative action, to the extent they are racial classifications at all, ought to survive strict scrutiny as narrowly tailored solutions designed to mitigate the effects of past discrimination.
The Measure and Mismeasure of Fairness: A Critical Review of Fair Machine Learning
The nascent field of fair machine learning aims to ensure that decisions
guided by algorithms are equitable. Over the last several years, three formal
definitions of fairness have gained prominence: (1) anti-classification,
meaning that protected attributes---like race, gender, and their proxies---are
not explicitly used to make decisions; (2) classification parity, meaning that
common measures of predictive performance (e.g., false positive and false
negative rates) are equal across groups defined by the protected attributes;
and (3) calibration, meaning that conditional on risk estimates, outcomes are
independent of protected attributes. Here we show that all three of these
fairness definitions suffer from significant statistical limitations. Requiring
anti-classification or classification parity can, perversely, harm the very
groups they were designed to protect; and calibration, though generally
desirable, provides little guarantee that decisions are equitable. In contrast
to these formal fairness criteria, we argue that it is often preferable to
treat similarly risky people similarly, based on the most statistically
accurate estimates of risk that one can produce. Such a strategy, while not
universally applicable, often aligns well with policy objectives; notably, this
strategy will typically violate both anti-classification and classification
parity. In practice, it requires significant effort to construct suitable risk
estimates. One must carefully define and measure the targets of prediction to
avoid retrenching biases in the data. But, importantly, one cannot generally
address these difficulties by requiring that algorithms satisfy popular
mathematical formalizations of fairness. By highlighting these challenges in
the foundation of fair machine learning, we hope to help researchers and
practitioners productively advance the area.
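As an illustration of two of these criteria, the following sketch uses synthetic data (a toy example, not the authors' analysis): it computes group-wise false positive rates for a threshold rule (classification parity) and outcome rates within score bins (calibration).

```python
# Toy illustration of classification parity and calibration on synthetic data.
import numpy as np

rng = np.random.default_rng(1)
group = rng.integers(0, 2, 10_000)                            # protected attribute (synthetic)
risk = np.clip(rng.beta(2, 5, 10_000) + 0.05 * group, 0, 1)   # risk estimates
y = (rng.random(10_000) < risk).astype(int)                   # outcomes drawn from the scores
decision = (risk >= 0.3).astype(int)                          # a simple threshold rule

# classification parity: are false positive rates equal across groups?
for g in (0, 1):
    m = (group == g) & (y == 0)
    print(f"group {g}: false positive rate = {decision[m].mean():.3f}")

# calibration: within a score bin, outcome rates should match regardless of group
bins = np.digitize(risk, [0.2, 0.4, 0.6, 0.8])
for b in range(5):
    for g in (0, 1):
        m = (bins == b) & (group == g)
        if m.any():
            print(f"bin {b}, group {g}: mean score {risk[m].mean():.2f}, "
                  f"outcome rate {y[m].mean():.2f}")
```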
FairCanary: Rapid Continuous Explainable Fairness
Machine Learning (ML) models are being used in all facets of today's society
to make high-stakes decisions, such as bail granting or credit lending, with
minimal regulation. Such systems are extremely vulnerable to both propagating
and amplifying social biases, and have therefore been subject to growing
research interest. One of the main issues with conventional fairness metrics is
their narrow definitions which hide the complete extent of the bias by focusing
primarily on positive and/or negative outcomes, whilst not paying attention to
the overall distributional shape. Moreover, these metrics often contradict one
another, are severely constrained by the contextual and legal landscape of the
problem, suffer technical limitations such as poor support for continuous
outputs and the requirement of class labels, and are not explainable.
In this paper, we present Quantile Demographic Drift, which addresses the
shortcomings mentioned above. This metric can also be used to measure
intra-group privilege. It is easily interpretable via existing attribution
techniques, and also extends naturally to individual fairness via the principle
of like-for-like comparison. We make this new fairness score the basis of a new
system that is designed to detect bias in production ML models without the need
for labels. We call the system FairCanary because of its capability to detect
bias in a live deployed model and narrow down the alert to the responsible set
of features, like the proverbial canary in a coal mine.
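For intuition, the sketch below computes a quantile-level disparity between two groups' score distributions. It is an illustrative approximation in the spirit of the described metric, not the paper's exact Quantile Demographic Drift definition, and the data are synthetic.

```python
# Quantile-by-quantile comparison of model score distributions across groups
# (illustrative only; not FairCanary's exact metric).
import numpy as np

rng = np.random.default_rng(2)
scores_a = rng.beta(2, 5, 5_000)          # model scores for group A (synthetic)
scores_b = rng.beta(2.5, 5, 5_000)        # model scores for group B (synthetic)

qs = np.linspace(0.01, 0.99, 99)
qa, qb = np.quantile(scores_a, qs), np.quantile(scores_b, qs)
drift = np.mean(np.abs(qa - qb))          # average per-quantile gap (Wasserstein-like)
print(f"quantile-level disparity between groups: {drift:.4f}")
print("largest gaps at quantiles:", qs[np.argsort(-np.abs(qa - qb))[:3]])
```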
Abstracting Fairness: Oracles, Metrics, and Interpretability
It is well understood that classification algorithms, for example, for
deciding on loan applications, cannot be evaluated for fairness without taking
context into account. We examine what can be learned from a fairness oracle
equipped with an underlying understanding of ``true'' fairness. The oracle
takes as input a (context, classifier) pair satisfying an arbitrary fairness
definition, and accepts or rejects the pair according to whether the classifier
satisfies the underlying fairness truth. Our principal conceptual result is an
extraction procedure that learns the underlying truth; moreover, the procedure
can learn an approximation to this truth given access to a weak form of the
oracle. Since every ``truly fair'' classifier induces a coarse metric, in which
those receiving the same decision are at distance zero from one another and
those receiving different decisions are at distance one, this extraction
process provides the basis for ensuring a rough form of metric fairness, also
known as individual fairness. Our principal technical result is a higher
fidelity extractor under a mild technical constraint on the weak oracle's
conception of fairness. Our framework permits the scenario in which many
classifiers, with differing outcomes, may all be considered fair. Our results
have implications for interpretability -- a highly desired but poorly defined
property of classification systems that endeavors to permit a human arbiter to
reject classifiers deemed to be ``unfair'' or illegitimately derived.
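The coarse metric mentioned above is simple to state in code. The following sketch (with a hypothetical toy classifier, not taken from the paper) returns distance zero for pairs of individuals receiving the same decision and one otherwise.

```python
# The coarse metric induced by a fixed classifier: d(x, y) = 0 if the two
# individuals receive the same decision, 1 otherwise.
from typing import Any, Callable

def induced_metric(classifier: Callable[[Any], int]) -> Callable[[Any, Any], int]:
    def d(x: Any, y: Any) -> int:
        return 0 if classifier(x) == classifier(y) else 1
    return d

# usage with a toy threshold classifier (hypothetical)
clf = lambda score: int(score >= 0.5)
d = induced_metric(clf)
print(d(0.7, 0.9), d(0.7, 0.2))   # 0 (same decision), 1 (different decisions)
```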