Proportionally Representative Clustering
In recent years, there has been a surge in effort to formalize notions of
fairness in machine learning. We focus on clustering -- one of the fundamental
tasks in unsupervised machine learning. We propose a new axiom, "proportional
representation fairness" (PRF), that is designed for clustering problems where
the selection of centroids reflects the distribution of data points and how
tightly they are clustered together. Our fairness concept is not satisfied by
existing fair clustering algorithms. We design efficient algorithms to achieve
PRF both for unconstrained and discrete clustering problems. Our algorithm for
the unconstrained setting is also the first known polynomial-time approximation
algorithm for the well-studied Proportional Fairness (PF) axiom (Chen, Fain,
Lyu, and Munagala, ICML, 2019). Our algorithm for the discrete setting also
matches the best known approximation factor for PF.
Comment: The revised version includes a new author (Jeremy Vollen) and the new results described above.
Service in Your Neighborhood: Fairness in Center Location
When selecting locations for a set of centers, standard clustering algorithms may place unfair burden on some individuals and neighborhoods. We formulate a fairness concept that takes local population densities into account. In particular, given k centers to locate and a population of size n, we define the "neighborhood radius" of an individual i as the minimum radius of a ball centered at i that contains at least n/k individuals. Our objective is to ensure that each individual has a center that is within at most a small constant factor of her neighborhood radius.
We present several theoretical results: We show that optimizing this factor is NP-hard; we give an approximation algorithm that guarantees a factor of at most 2 in all metric spaces; and we prove matching lower bounds in some metric spaces. We apply a variant of this algorithm to real-world address data, showing that it differs markedly from standard clustering algorithms, outperforms them on our objective function, and balances the load between centers more evenly.
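The neighborhood-radius objective above can be checked directly. The following is a minimal sketch, assuming Euclidean points; the function names are illustrative, not from the paper:

```python
import numpy as np

def neighborhood_radii(points, k):
    """For each individual i, the 'neighborhood radius' is the minimum
    radius of a ball centered at i containing at least n/k individuals
    (including i itself)."""
    n = len(points)
    need = int(np.ceil(n / k))  # at least n/k individuals
    # pairwise Euclidean distances, each row sorted ascending (d[i, 0] == 0)
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    d.sort(axis=1)
    return d[:, need - 1]       # radius reaching the need-th closest individual

def achieved_factor(points, centers, k):
    """Largest ratio of (distance to nearest center) / (neighborhood radius)."""
    r = neighborhood_radii(points, k)
    dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=-1).min(axis=1)
    return float(np.max(dist / r))
```

The quantity returned by `achieved_factor` is exactly the factor the abstract seeks to bound by a small constant.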
Proportional Fairness in Clustering: A Social Choice Perspective
We study the proportional clustering problem of Chen et al. [ICML'19] and
relate it to the area of multiwinner voting in computational social choice. We
show that any clustering satisfying a weak proportionality notion of Brill and
Peters [EC'23] simultaneously obtains the best known approximations to the
proportional fairness notion of Chen et al. [ICML'19], but also to individual
fairness [Jung et al., FORC'20] and the "core" [Li et al., ICML'21]. In fact, we
show that any approximation to proportional fairness is also an approximation
to individual fairness and vice versa. Finally, we also study stronger notions
of proportional representation, in which deviations may involve not only a
single candidate center but several, and show that the stronger proportionality
notions of Brill and Peters [EC'23] imply approximations to these stronger
guarantees.
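As a concrete reading of the Chen et al. [ICML'19] notion, proportional fairness can be checked by brute force: no candidate center may exist that at least ⌈n/k⌉ points would all strictly prefer to their current centers. A minimal sketch, assuming Euclidean points (`rho` covers the ρ-approximate variant; names are illustrative):

```python
import numpy as np

def is_proportionally_fair(points, centers, candidates, k, rho=1.0):
    """Brute-force check of (rho-approximate) proportional fairness:
    return False iff some candidate y attracts a blocking coalition of
    at least ceil(n/k) points with rho * d(x, y) < d(x, centers)."""
    n = len(points)
    threshold = int(np.ceil(n / k))
    # each point's distance to its nearest current center
    d_cur = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=-1).min(axis=1)
    for y in candidates:
        d_y = np.linalg.norm(points - y, axis=1)
        if np.sum(rho * d_y < d_cur) >= threshold:
            return False  # a blocking coalition around y exists
    return True
```

This exhaustive checker runs in O(|candidates| · n) time and is meant only to make the definition concrete, not as an algorithm from any of the cited papers.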
Approximation Algorithms for Fair Range Clustering
This paper studies the fair range clustering problem in which the data points
are from different demographic groups and the goal is to pick centers with
the minimum clustering cost such that each group is at least minimally
represented in the centers set and no group dominates the centers set. More
precisely, given a set of points in a metric space where each point
belongs to one of the different demographics (i.e., ) and a set of intervals on desired number of centers from
each group, the goal is to pick a set of centers with minimum
-clustering cost (i.e., ) such that for
each group , . In particular,
the fair range -clustering captures fair range -center, -median
and -means as its special cases. In this work, we provide efficient constant
factor approximation algorithms for fair range -clustering for all
values of .Comment: ICML 202
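The objective and the range constraints can be sketched in a few lines; this is a minimal illustration assuming Euclidean points, with hypothetical helper names:

```python
import numpy as np

def lp_cost(points, centers, p):
    """(sum over points of d(v, C)^p)^(1/p); p = 1 gives the k-median
    objective, p = inf the k-center objective (k-means is the squared
    p = 2 cost)."""
    d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=-1).min(axis=1)
    if np.isinf(p):
        return float(d.max())
    return float((d ** p).sum() ** (1.0 / p))

def satisfies_ranges(center_groups, intervals):
    """center_groups: the group label of each chosen center;
    intervals: {group: (alpha, beta)} bounds on centers per group."""
    counts = {g: 0 for g in intervals}
    for g in center_groups:
        counts[g] += 1
    return all(a <= counts[g] <= b for g, (a, b) in intervals.items())
```

A feasible fair range solution is any center set passing `satisfies_ranges`; the approximation algorithms in the paper bound its `lp_cost` against the optimum.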
Changes in epidemiological patterns of sea lice infestation on farmed Atlantic salmon, Salmo salar L., in Scotland between 1996 and 2006
Analyses of a unique database containing sea lice records over an 11-year period provide evidence of changing infestation patterns in Scotland. The data, collected from more than 50 commercial Atlantic salmon farms, indicate that both species of sea lice commonly found in Scotland, Lepeophtheirus salmonis and Caligus elongatus, have declined on farms over the past decade. Reductions for both species have been particularly marked since 2001, when more effective veterinary medicines became available. Treatment data were also available in the database and show a growing trend towards the use of the in-feed medication emamectin benzoate (Slice), particularly in the first year of the salmon production cycle. However, this trend towards single-product use has not been sustained in 2006, the latest year for which data are available. There is some evidence of region-to-region variation within Scotland, with the Western Isles experiencing higher levels of infestation. However, compared to the levels observed between 1996 and 2000, all regions have benefited from reduced lice infestation, with the overall pattern showing a particular reduction in the second and third quarters of the second year of production.
Role of homeostasis in learning sparse representations
Neurons in the input layer of primary visual cortex in primates develop
edge-like receptive fields. One approach to understanding the emergence of this
response is to state that neural activity has to efficiently represent sensory
data with respect to the statistics of natural scenes. Furthermore, it is
believed that such an efficient coding is achieved using a competition across
neurons so as to generate a sparse representation, that is, one in which a
relatively small number of neurons is simultaneously active. Indeed, different models of
sparse coding, coupled with Hebbian learning and homeostasis, have been
proposed that successfully match the observed emergent response. However, the
specific role of homeostasis in learning such sparse representations is still
largely unknown. By quantitatively assessing the efficiency of the neural
representation during learning, we derive a cooperative homeostasis mechanism
that optimally tunes the competition between neurons within the sparse coding
algorithm. We apply this homeostasis while learning small patches taken from
natural images and compare its efficiency with state-of-the-art algorithms.
Results show that while different sparse coding algorithms give similar coding
results, the homeostasis provides an optimal balance for the representation of
natural images within the population of neurons. Competition in sparse coding
is optimized when it is fair. By contributing to optimizing statistical
competition across neurons, homeostasis is crucial in providing a more
efficient solution to the emergence of independent components.
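The interplay between sparse coding and homeostasis can be illustrated with a toy sketch: greedy (matching-pursuit-style) coding in which homeostatic gains bias the competition so that all dictionary atoms are selected about equally often. This is an illustrative simplification, not the paper's actual algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)

def sparse_code(x, D, gains, n_active):
    """Greedy sparse coding: repeatedly pick the atom (row of D) with the
    largest gain-modulated correlation to the residual and subtract its
    contribution."""
    residual, chosen = x.copy(), []
    for _ in range(n_active):
        corr = gains * (D @ residual)   # homeostatic gains bias the competition
        i = int(np.argmax(np.abs(corr)))
        a = D[i] @ residual             # coefficient uses the raw correlation
        residual = residual - a * D[i]
        chosen.append(i)
    return chosen, residual

def update_gains(gains, chosen, n_atoms, eta=0.05):
    """Homeostasis: atoms selected more often than the uniform target have
    their gain lowered; under-used atoms are boosted."""
    target = 1.0 / n_atoms
    used = np.bincount(chosen, minlength=n_atoms) / max(len(chosen), 1)
    return gains * np.exp(eta * (target - used))
```

Iterating `sparse_code` and `update_gains` over many inputs drives the selection statistics towards uniformity, which is the sense in which competition becomes "fair" here.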