1 research outputs found
Dependent randomized rounding for clustering and partition systems with knapsack constraints
Clustering problems are fundamental to unsupervised learning. There is an
increased emphasis on fairness in machine learning and AI; one representative
notion of fairness is that no single demographic group should be
over-represented among the cluster-centers. This, and much more general
clustering problems, can be formulated with "knapsack" and "partition"
constraints. We develop new randomized algorithms targeting such problems, and
study two in particular: multi-knapsack median and multi-knapsack center. Our
rounding algorithms give new approximation and pseudo-approximation algorithms
for these problems. One key technical tool, which may be of independent
interest, is a new tail bound analogous to Feige (2006) for sums of random
variables with unbounded variances. Such bounds are very useful in inferring
properties of large networks using few samples