Individual Fairness in Pipelines
It is well understood that a system built from individually fair components
may not itself be individually fair. In this work, we investigate individual
fairness under pipeline composition. Pipelines differ from ordinary sequential
or repeated composition in that individuals may drop out at any stage, and
classification in subsequent stages may depend on the remaining "cohort" of
individuals. As an example, a company might hire a team for a new project and
at a later point promote the highest performer on the team. Unlike other
repeated classification settings, where the degree of unfairness degrades
gracefully over multiple fair steps, the degree of unfairness in pipelines can
be arbitrary, even in a pipeline with just two stages.
Guided by a panoply of real-world examples, we provide a rigorous framework
for evaluating different types of fairness guarantees for pipelines. We show
that naïve auditing is unable to uncover systematic unfairness and that, in
order to ensure fairness, some form of dependence must exist between the design
of algorithms at different stages in the pipeline. Finally, we provide
constructions that permit flexibility at later stages, meaning that there is no
need to lock in the entire pipeline at the time that the early stage is
constructed.
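
To make the hire-then-promote example concrete, here is a minimal sketch of a two-stage pipeline in which each stage looks reasonable in isolation, yet two identically scored individuals end up with very different outcomes. The names, scores, and cohort probabilities are all hypothetical and are not taken from the paper; the sketch only illustrates the cohort-dependence effect the abstract describes.

```python
from fractions import Fraction

# Hypothetical performance scores (assumption): A and B are identical
# on the task-relevant metric, so they should be treated similarly.
scores = {"A": 0.80, "B": 0.80, "Strong": 0.95, "Weak": 0.40}

# Stage 1 (hypothetical): hire exactly one of two cohorts, each with
# probability 1/2. A and B are therefore hired with equal probability.
cohorts = [
    ({"A", "Strong"}, Fraction(1, 2)),
    ({"B", "Weak"}, Fraction(1, 2)),
]


def hire_probability(person):
    """Probability that `person` survives stage 1."""
    return sum(p for cohort, p in cohorts if person in cohort)


def promote(cohort):
    """Stage 2: promote the highest performer in the hired cohort."""
    return max(cohort, key=scores.get)


def promotion_probability(person):
    """Probability that `person` is promoted by the whole pipeline."""
    return sum(p for cohort, p in cohorts if promote(cohort) == person)


if __name__ == "__main__":
    for person in ("A", "B"):
        print(person, hire_probability(person), promotion_probability(person))
```

In this toy setup A and B are hired with the same probability, but A is never promoted (A always competes against a stronger teammate) while B is promoted whenever hired. An audit that checks stage 1 alone, or stage 2 alone within a fixed cohort, would not reveal this gap.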
Fairness-Aware Instrumentation of Preprocessing Pipelines for Machine Learning
Surfacing and mitigating bias in ML pipelines is a complex topic, with a dire need to provide system-level support to data scientists. Humans should be empowered to debug these pipelines, in order to control for bias and to improve data quality and representativeness. We propose fair-DAGs, an open-source library that extracts directed acyclic graph (DAG) representations of the data flow in preprocessing pipelines for ML. The library subsequently instruments the pipelines with tracing and visualization code to capture changes in data distributions and identify distortions with respect to protected group membership as the data travels through the pipeline. We illustrate the utility of fair-DAGs with experiments on publicly available ML pipelines.
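
The idea of tracing how group proportions shift across preprocessing steps can be sketched in a few lines of plain Python. This is not the fair-DAGs API: the function names (`group_shares`, `trace_pipeline`), the toy records, and the single `drop_missing` step are all assumptions made for illustration.

```python
from collections import Counter


def group_shares(rows, key):
    """Return the share of each group (by `key`) among `rows`."""
    counts = Counter(row[key] for row in rows)
    total = sum(counts.values())
    return {g: c / total for g, c in counts.items()} if total else {}


def trace_pipeline(rows, steps, key):
    """Apply each named step in order and record group shares after it."""
    report = [("input", group_shares(rows, key))]
    for name, step in steps:
        rows = step(rows)
        report.append((name, group_shares(rows, key)))
    return report


# Toy data (hypothetical): group "b" has a missing income value.
rows = [
    {"group": "a", "income": 50},
    {"group": "a", "income": 60},
    {"group": "b", "income": None},
    {"group": "b", "income": 40},
]

# A single preprocessing step (hypothetical): drop records with missing income.
steps = [
    ("drop_missing", lambda rs: [r for r in rs if r["income"] is not None]),
]

report = trace_pipeline(rows, steps, "group")

if __name__ == "__main__":
    for name, shares in report:
        print(name, shares)
```

Here a seemingly neutral cleaning step lowers group "b" from half of the data to a third, which is the kind of distribution shift that per-step tracing is meant to surface.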