    When Personalization Harms: Reconsidering the Use of Group Attributes in Prediction

    Machine learning models are often personalized with categorical attributes that are protected, sensitive, self-reported, or costly to acquire. In this work, we show that models personalized with group attributes can reduce performance at the group level. We propose formal conditions to ensure the "fair use" of group attributes in prediction tasks by training one additional model -- i.e., collective preference guarantees that ensure each group that provides personal data receives a tailored gain in performance in return. We present sufficient conditions to ensure fair use in empirical risk minimization and characterize failure modes that lead to fair use violations arising from standard practices in model development and deployment. We present a comprehensive empirical study of fair use in clinical prediction tasks. Our results demonstrate the prevalence of fair use violations in practice and illustrate simple interventions to mitigate their harm. (Comment: ICML 2023 Oral)
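
    The "fair use" condition above has a simple operational reading: a model personalized with a group attribute should perform at least as well, for every group, as a generic model trained without it. The sketch below audits that condition by comparing per-group accuracy of the two models; the function and variable names are illustrative assumptions, not the authors' implementation, and scikit-learn's LogisticRegression stands in for an arbitrary learner.

```python
# Minimal sketch of a per-group "fair use" audit (hypothetical helper,
# not the authors' code).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

def fair_use_audit(X, y, group, X_test, y_test, group_test):
    """Flag groups that lose performance when a group attribute is added."""
    generic = LogisticRegression(max_iter=1000).fit(X, y)

    # Personalize by appending the group attribute as an extra feature
    # (a one-hot encoding would be more typical; a scalar code keeps this short).
    Xp = np.column_stack([X, group])
    Xp_test = np.column_stack([X_test, group_test])
    personalized = LogisticRegression(max_iter=1000).fit(Xp, y)

    violations = {}
    for g in np.unique(group_test):
        m = group_test == g
        gain = (accuracy_score(y_test[m], personalized.predict(Xp_test[m]))
                - accuracy_score(y_test[m], generic.predict(X_test[m])))
        if gain < 0:           # fair use requires a non-negative gain per group
            violations[g] = gain
    return violations          # non-empty => a fair use violation
```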

    One-shot Empirical Privacy Estimation for Federated Learning

    Privacy estimation techniques for differentially private (DP) algorithms are useful for comparing against analytical bounds, or for empirically measuring privacy loss in settings where known analytical bounds are not tight. However, existing privacy auditing techniques usually make strong assumptions about the adversary (e.g., knowledge of intermediate model iterates or the training data distribution), are tailored to specific tasks and model architectures, and require retraining the model many times (typically on the order of thousands). These shortcomings make such techniques difficult to deploy at scale in practice, especially in federated settings where model training can take days or weeks. In this work, we present a novel "one-shot" approach that can systematically address these challenges, allowing efficient auditing or estimation of the privacy loss of a model during the same, single training run used to fit model parameters, and without requiring any a priori knowledge of the model architecture or task. We show that our method provides provably correct estimates for privacy loss under the Gaussian mechanism, and we demonstrate its performance on a well-established FL benchmark dataset under several adversarial models.
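
    The claim of provably correct estimates under the Gaussian mechanism suggests a simple picture: random canary updates injected into the single training run induce a Gaussian mechanism whose effective signal-to-noise ratio can be read off the final model. The sketch below is my simplification of that idea; all names and inputs are assumptions, and the paper's actual estimator differs in detail, but the conversion from the Gaussian-DP parameter mu to (epsilon, delta) is the standard one.

```python
# Minimal sketch of the "one-shot" idea, simplified: estimate the
# Gaussian-DP parameter mu from canary correlations with the final model,
# then convert it to an (epsilon, delta)-DP estimate.
import numpy as np
from scipy.optimize import brentq
from scipy.stats import norm

def estimate_epsilon(final_delta, inserted_canaries, fresh_canaries, delta=1e-5):
    """final_delta: final model minus initial model, shape (d,).
    inserted_canaries: (k, d) random unit vectors added as fake client updates.
    fresh_canaries:    (k, d) random unit vectors never used in training.
    """
    obs = inserted_canaries @ final_delta   # shifted by canary participation
    null = fresh_canaries @ final_delta     # pure-noise baseline
    mu = (obs.mean() - null.mean()) / null.std()   # Gaussian-DP parameter
    mu = max(mu, 1e-6)   # sketch assumes the canaries are detectable (mu > 0)

    # Standard Gaussian-DP -> (epsilon, delta) conversion (Dong et al., 2019):
    # delta(eps) = Phi(mu/2 - eps/mu) - exp(eps) * Phi(-mu/2 - eps/mu)
    def delta_of(eps):
        return norm.cdf(mu / 2 - eps / mu) - np.exp(eps) * norm.cdf(-mu / 2 - eps / mu)

    # Root-find the epsilon whose delta matches the target; the [0, 100]
    # bracket suits moderate mu and should be widened for very large epsilon.
    return brentq(lambda eps: delta_of(eps) - delta, 0.0, 100.0)
```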

    Private Multi-Winner Voting for Machine Learning

    Private multi-winner voting is the task of revealing k-hot binary vectors satisfying a bounded differential privacy (DP) guarantee. This task has been understudied in the machine learning literature despite its prevalence in many domains such as healthcare. We propose three new DP multi-winner mechanisms: Binary, τ, and Powerset voting. Binary voting operates independently per label through composition. τ voting bounds votes optimally in their ℓ2 norm for tight data-independent guarantees. Powerset voting operates over the entire binary vector by viewing the possible outcomes as a power set. Our theoretical and empirical analysis shows that Binary voting can be a competitive mechanism on many tasks unless there are strong correlations between labels, in which case Powerset voting outperforms it. We use our mechanisms to enable privacy-preserving multi-label learning in the central setting by extending the canonical single-label technique: PATE. We find that our techniques outperform current state-of-the-art approaches on large, real-world healthcare data and standard multi-label benchmarks. We further enable multi-label confidential and private collaborative (CaPC) learning and show that model performance can be significantly improved in the multi-site setting. (Comment: Accepted at PoPETS 2023)
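
    As a rough illustration of the first mechanism, the sketch below (my reading of the abstract, not the authors' released code) releases each label via an independent Gaussian-noised majority vote over teacher predictions, so the overall guarantee follows by composition across the labels.

```python
# Rough illustration of Binary voting: each label is a Gaussian-noised
# majority vote, in the spirit of PATE's GNMax, and the total privacy cost
# follows by composition over the per-label releases.
import numpy as np

def binary_voting(teacher_votes, sigma, seed=None):
    """teacher_votes: (n_teachers, n_labels) binary matrix of k-hot predictions.
    Returns a noisy k-hot vector; privacy composes over the n_labels releases.
    """
    rng = np.random.default_rng(seed)
    n_teachers, n_labels = teacher_votes.shape
    margin = 2 * teacher_votes.sum(axis=0) - n_teachers   # votes(1) - votes(0)
    noisy = margin + rng.normal(0.0, sigma, size=n_labels)
    return (noisy > 0).astype(int)

# Example: 250 teachers voting over 14 labels.
votes = np.random.default_rng(0).integers(0, 2, size=(250, 14))
print(binary_voting(votes, sigma=40.0, seed=1))
```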

    Algorithmic Pluralism: A Structural Approach Towards Equal Opportunity

    The idea of equal opportunity enjoys wide acceptance because of the freedom opportunities provide us to shape our lives. Many disagree deeply, however, about the meaning of equal opportunity, especially in algorithmic decision-making. A new theory of equal opportunity adopts a structural approach, describing how decisions can operate as bottlenecks or narrow places in the structure of opportunities. This viewpoint on discrimination highlights fundamental problems with equal opportunity and its achievement through formal fairness interventions, and instead advocates for a more pluralistic approach that prioritizes opening up more opportunities for more people. We extend this theory of bottlenecks to data-driven decision-making, adapting it to center concerns about the extent to which algorithms can create severe bottlenecks in the opportunity structure. We recommend algorithmic pluralism: the prioritization of alleviating severity in systems of algorithmic decision-making. Drawing on examples from education, healthcare, and criminal justice, we show how this structural approach helps reframe debates about equal opportunity in system design and regulation, and how algorithmic pluralism could help expand opportunities in a more positive-sum way.