When Personalization Harms: Reconsidering the Use of Group Attributes in Prediction
Machine learning models are often personalized with categorical attributes
that are protected, sensitive, self-reported, or costly to acquire. In this
work, we show that models personalized with group attributes can reduce
performance at a group level. We propose formal conditions to ensure the "fair
use" of group attributes in prediction tasks by training one additional model
-- i.e., collective preference guarantees that ensure each group that provides
personal data receives a tailored gain in performance in return.
We present sufficient conditions to ensure fair use in empirical risk
minimization and characterize failure modes that lead to fair use violations
due to standard practices in model development and deployment. We present a
comprehensive empirical study of fair use in clinical prediction tasks. Our
results demonstrate the prevalence of fair use violations in practice and
illustrate simple interventions to mitigate their harm.
Comment: ICML 2023 Oral
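A minimal sketch of the kind of group-level comparison described above: fit one model with the group attribute as an extra feature and one without, then report per-group test loss, so a group can be said to gain from reporting its attribute only if the personalized model is no worse for it. The logistic-regression models, column layout, and synthetic data are assumptions for illustration, not the paper's training procedure or formal conditions.

```python
# Illustrative sketch: per-group loss with vs. without the group attribute.
# Models, features, and data below are hypothetical, not the paper's setup.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss

def fair_use_report(X, g, y, X_test, g_test, y_test):
    """Compare per-group test loss of a personalized model (features plus an
    integer-coded group attribute) against a generic model (features only)."""
    personalized = LogisticRegression(max_iter=1000).fit(np.column_stack([X, g]), y)
    generic = LogisticRegression(max_iter=1000).fit(X, y)

    proba_pers = personalized.predict_proba(np.column_stack([X_test, g_test]))
    proba_gen = generic.predict_proba(X_test)

    report = {}
    for group in np.unique(g_test):
        m = g_test == group
        loss_pers = log_loss(y_test[m], proba_pers[m], labels=[0, 1])
        loss_gen = log_loss(y_test[m], proba_gen[m], labels=[0, 1])
        # Informal check: reporting the attribute should not hurt the group.
        report[group] = {"personalized": loss_pers,
                         "generic": loss_gen,
                         "gain": loss_gen - loss_pers}
    return report

# Hypothetical usage with synthetic data.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 5))
g = rng.integers(0, 3, 400)
y = (X[:, 0] + 0.5 * g + rng.normal(size=400) > 0).astype(int)
print(fair_use_report(X[:300], g[:300], y[:300], X[300:], g[300:], y[300:]))
```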
One-shot Empirical Privacy Estimation for Federated Learning
Privacy estimation techniques for differentially private (DP) algorithms are
useful for comparing against analytical bounds, or for empirically measuring
privacy loss in settings where known analytical bounds are not tight. However,
existing privacy auditing techniques usually make strong assumptions on the
adversary (e.g., knowledge of intermediate model iterates or the training data
distribution), are tailored to specific tasks and model architectures, and
require retraining the model many times (typically on the order of thousands).
These shortcomings make deploying such techniques at scale difficult in
practice, especially in federated settings where model training can take days
or weeks. In this work, we present a novel "one-shot" approach that can
systematically address these challenges, allowing efficient auditing or
estimation of the privacy loss of a model during the same, single training run
used to fit model parameters, and without requiring any a priori knowledge
about the model architecture or task. We show that our method provides provably
correct estimates for privacy loss under the Gaussian mechanism, and we
demonstrate its performance on a well-established FL benchmark dataset under
several adversarial models.
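The abstract does not spell out the estimator, so the sketch below shows only one standard building block such an audit can rest on under the stated Gaussian-mechanism assumption: converting an (empirically estimated) effective noise multiplier into an epsilon at a fixed delta via the exact Gaussian-mechanism relation of Balle and Wang (2018). The helper names and the bisection bracket are illustrative; this is not the authors' full one-shot procedure.

```python
# Minimal sketch, assuming the audited release behaves like a Gaussian
# mechanism with sensitivity 1: turn an estimated noise multiplier sigma
# into an epsilon at a fixed delta (exact relation of Balle & Wang, 2018).
import math
from scipy.stats import norm

def gaussian_delta(epsilon: float, sigma: float) -> float:
    """Smallest delta for which N(0, sigma^2) noise at sensitivity 1
    satisfies (epsilon, delta)-DP."""
    return (norm.cdf(0.5 / sigma - epsilon * sigma)
            - math.exp(epsilon) * norm.cdf(-0.5 / sigma - epsilon * sigma))

def estimate_epsilon(sigma: float, delta: float = 1e-5) -> float:
    """Invert gaussian_delta in epsilon by bisection at the target delta.
    Assumes the answer lies below 100."""
    lo, hi = 0.0, 100.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if gaussian_delta(mid, sigma) > delta:
            lo = mid  # delta still too large, need a larger epsilon
        else:
            hi = mid
    return hi

# Example: an estimated effective noise multiplier of 1.0 at delta = 1e-5.
print(estimate_epsilon(sigma=1.0, delta=1e-5))
```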
Private Multi-Winner Voting for Machine Learning
Private multi-winner voting is the task of revealing k-hot binary vectors
satisfying a bounded differential privacy (DP) guarantee. This task has been
understudied in machine learning literature despite its prevalence in many
domains such as healthcare. We propose three new DP multi-winner mechanisms:
Binary, τ, and Powerset voting. Binary voting operates independently per
label through composition. τ voting bounds votes optimally in their ℓ2
norm for tight data-independent guarantees. Powerset voting operates
over the entire binary vector by viewing the possible outcomes as a power set.
Our theoretical and empirical analysis shows that Binary voting can be a
competitive mechanism on many tasks unless there are strong correlations
between labels, in which case Powerset voting outperforms it. We use our
mechanisms to enable privacy-preserving multi-label learning in the central
setting by extending the canonical single-label technique: PATE. We find that
our techniques outperform current state-of-the-art approaches on large,
real-world healthcare data and standard multi-label benchmarks. We further
enable multi-label confidential and private collaborative (CaPC) learning and
show that model performance can be significantly improved in the multi-site
setting.
Comment: Accepted at PoPETS 202
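An illustrative reading of the per-label ("Binary") voting idea above, not the paper's exact mechanism or accounting: aggregate teacher votes independently for each label, add Gaussian noise to each label's vote count, release the noisy majority, and account for the total privacy cost by composing across labels. The noise scale, threshold, and array shapes are assumptions for the sketch.

```python
# Sketch of per-label ("Binary") voting: noisy majority per label,
# with the overall guarantee obtained by composition over labels.
import numpy as np

def binary_voting(teacher_votes: np.ndarray, sigma: float,
                  rng: np.random.Generator) -> np.ndarray:
    """teacher_votes: (n_teachers, n_labels) 0/1 matrix of per-label votes.

    Returns a noisy binary vector: label j is set when a noisy majority of
    teachers voted for it. Each label consumes its own budget; the total
    guarantee follows by composing the n_labels releases."""
    n_teachers, n_labels = teacher_votes.shape
    counts = teacher_votes.sum(axis=0).astype(float)
    noisy_counts = counts + rng.normal(0.0, sigma, size=n_labels)
    return (noisy_counts > n_teachers / 2).astype(int)

# Hypothetical usage: 50 teachers voting over 5 labels.
rng = np.random.default_rng(0)
votes = rng.integers(0, 2, size=(50, 5))
print(binary_voting(votes, sigma=5.0, rng=rng))
```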
Algorithmic Pluralism: A Structural Approach Towards Equal Opportunity
The idea of equal opportunity enjoys wide acceptance because of the freedom
opportunities provide us to shape our lives. Many disagree deeply, however,
about the meaning of equal opportunity, especially in algorithmic
decision-making. A new theory of equal opportunity adopts a structural
approach, describing how decisions can operate as bottlenecks or narrow places
in the structure of opportunities. This viewpoint on discrimination highlights
fundamental problems with equal opportunity and its achievement through formal
fairness interventions, and instead advocates for a more pluralistic approach
that prioritizes opening up more opportunities for more people. We extend this
theory of bottlenecks to data-driven decision-making, adapting it to center
concerns about the extent to which algorithms can create severe bottlenecks in
the opportunity structure. We recommend algorithmic pluralism: the
prioritization of alleviating severity in systems of algorithmic
decision-making. Drawing on examples from education, healthcare, and criminal
justice, we show how this structural approach helps reframe debates about equal
opportunity in system design and regulation, and how algorithmic pluralism
could help expand opportunities in a more positive-sum way.