Fair Clustering Through Fairlets
We study the question of fair clustering under the disparate impact
doctrine, where each protected class must have approximately equal
representation in every cluster. We formulate the fair clustering problem under
both the k-center and the k-median objectives, and show that even with two
protected classes the problem is challenging, as the optimum solution can
violate common conventions: for instance, a point may no longer be assigned to
its nearest cluster center! En route we introduce the concept of fairlets,
which are minimal sets that satisfy fair representation while approximately
preserving the clustering objective. We show that any fair clustering problem
can be decomposed into first finding good fairlets, and then using existing
machinery for traditional clustering algorithms. While finding good fairlets
can be NP-hard, we proceed to obtain efficient approximation algorithms based
on minimum cost flow. We empirically quantify the value of fair clustering on
real-world datasets with sensitive attributes.
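As a rough illustration of what "approximately equal representation in every cluster" could mean in practice, the sketch below scores a clustering with two protected classes by the worst ratio between the class counts in any cluster. This ratio-based score, the function name `clustering_balance`, and the input layout are assumptions made for this example; the abstract itself does not specify a measure.

```python
from collections import Counter, defaultdict


def clustering_balance(assignments, classes):
    """Return the worst per-cluster class ratio for a two-class clustering.

    assignments[i] is the cluster id of point i and classes[i] is its
    protected class. The score lies in [0, 1]: 1 means every cluster holds
    both classes in equal numbers, 0 means some cluster lacks one class.
    (Illustrative measure, assumed for this sketch.)
    """
    class_names = sorted(set(classes))
    if len(class_names) != 2:
        raise ValueError("this sketch handles exactly two protected classes")
    first, second = class_names

    # Count how many points of each class fall into each cluster.
    counts = defaultdict(Counter)
    for cluster, cls in zip(assignments, classes):
        counts[cluster][cls] += 1

    worst = 1.0
    for per_cluster in counts.values():
        n_first, n_second = per_cluster[first], per_cluster[second]
        if n_first == 0 or n_second == 0:
            return 0.0  # one class is entirely missing from this cluster
        worst = min(worst, n_first / n_second, n_second / n_first)
    return worst
```

For example, a clustering that puts all points of one class into a single cluster scores 0, while one that mixes the classes evenly in every cluster scores 1.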
Service in Your Neighborhood: Fairness in Center Location
When selecting locations for a set of centers, standard clustering algorithms may place unfair burden on some individuals and neighborhoods. We formulate a fairness concept that takes local population densities into account. In particular, given k centers to locate and a population of size n, we define the "neighborhood radius" of an individual i as the minimum radius of a ball centered at i that contains at least n/k individuals. Our objective is to ensure that each individual has a center that is within at most a small constant factor of her neighborhood radius.
We present several theoretical results: We show that optimizing this factor is NP-hard; we give an approximation algorithm that guarantees a factor of at most 2 in all metric spaces; and we prove matching lower bounds in some metric spaces. We apply a variant of this algorithm to real-world address data, showing that it behaves quite differently from standard clustering algorithms, outperforms them on our objective function, and balances the load between centers more evenly.
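The "neighborhood radius" definition above can be made concrete with a small sketch. It assumes points in Euclidean space, counts individual i as part of her own ball, and rounds n/k up to a whole number; these choices, along with the function names `neighborhood_radius` and `fairness_factor`, are assumptions for illustration and are not taken from the paper. It only evaluates a given set of centers and does not implement the approximation algorithm.

```python
import math


def neighborhood_radius(points, i, k):
    """Smallest radius of a ball around points[i] holding >= n/k points.

    Assumes Euclidean distance and counts points[i] itself as inside
    the ball (both are assumptions of this sketch).
    """
    n = len(points)
    needed = math.ceil(n / k)
    distances = sorted(math.dist(points[i], p) for p in points)
    # distances[0] is 0 (the point itself); the needed-th smallest
    # distance is the radius that first captures enough individuals.
    return distances[needed - 1]


def fairness_factor(points, centers, k):
    """Largest ratio of distance-to-nearest-center over neighborhood radius."""
    worst = 0.0
    for i, p in enumerate(points):
        radius = neighborhood_radius(points, i, k)
        nearest = min(math.dist(p, c) for c in centers)
        if radius > 0:
            ratio = nearest / radius
        else:
            # Degenerate case: enough duplicate points sit exactly at p.
            ratio = 0.0 if nearest == 0 else math.inf
        worst = max(worst, ratio)
    return worst
```

A factor of at most 2, as in the guarantee quoted above, would mean every individual has a center within twice her neighborhood radius.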
Whither Fair Clustering?
Within the relatively busy area of fair machine learning that has been
dominated by classification fairness research, fairness in clustering has
started to see some recent attention. In this position paper, we assess the
existing work in fair clustering and observe that there are several directions
that are yet to be explored, and postulate that the state-of-the-art in fair
clustering has been quite parochial in outlook. We posit that widening the
normative principles to target, characterizing shortfalls where the target
cannot be achieved fully, and making use of knowledge of downstream processes
can significantly widen the scope of research in fair clustering. At a
time when clustering and unsupervised learning are being increasingly used to
make and influence decisions that matter significantly to human lives, we
believe that widening the ambit of fair clustering is of immense significance.
Comment: Accepted at the AI for Social Good Workshop, Harvard, July 20-21,
202
Towards Algorithmic Fairness in Space-Time: Filling in Black Holes
New technologies and the availability of geospatial data have drawn attention
to spatio-temporal biases present in society. For example: the COVID-19
pandemic highlighted disparities in the availability of broadband service and
its role in the digital divide; the environmental justice movement in the
United States has raised awareness of health implications for minority
populations stemming from historical redlining practices; and studies have
found varying quality and coverage in the collection and sharing of open-source
geospatial data. Despite the extensive literature on machine learning (ML)
fairness, few algorithmic strategies have been proposed to mitigate such
biases. In this paper we highlight the unique challenges for quantifying and
addressing spatio-temporal biases, through the lens of use cases presented in
the scientific literature and media. We envision a roadmap of ML strategies
that need to be developed or adapted to quantify and overcome these challenges,
including transfer learning, active learning, and reinforcement learning
techniques. Further, we discuss the potential role of ML in providing guidance
to policy makers on issues related to spatial fairness.