
    Achieving Differential Privacy and Fairness in Machine Learning

    Machine learning algorithms are used to make decisions in applications such as recruiting, lending, and policing. These algorithms rely on large amounts of sensitive individual information to work properly, so there are societal concerns about machine learning on matters of privacy and fairness. Currently, many studies focus on protecting individual privacy or ensuring the fairness of algorithms separately, without taking their connection into consideration. However, new challenges arise in privacy-preserving and fairness-aware machine learning. On one hand, there is fairness within the private model, i.e., how to meet both privacy and fairness requirements simultaneously in a machine learning algorithm. On the other hand, there is fairness between the private model and the non-private model, i.e., how to ensure that the utility loss due to differential privacy is the same for each group. The goal of this dissertation is to address challenging issues in privacy-preserving and fairness-aware machine learning: achieving differential privacy with satisfactory utility and efficiency in complex and emerging tasks, using generative models to generate fair data and to assist fair classification, achieving both differential privacy and fairness simultaneously within the same model, and achieving equal utility loss w.r.t. each group between the private model and the non-private model. In this dissertation, we develop the following algorithms to address the above challenges. (1) We develop the PrivPC and DPNE algorithms to achieve differential privacy in the complex and emerging tasks of causal graph discovery and network embedding, respectively. (2) We develop a fair generative adversarial network framework and three algorithms (FairGAN, FairGAN+, and CFGAN) to achieve fair data generation and classification through generative models based on different association-based and causation-based fairness notions. (3) We develop the PFLR and PFLR* algorithms to simultaneously achieve both differential privacy and fairness in logistic regression. (4) We develop the DPSGD-F algorithm to remove the disparate impact of differential privacy on model accuracy w.r.t. each group.
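    As a rough illustration of the third goal (differential privacy and fairness within the same model), the sketch below trains a logistic regression with DP-SGD-style per-example clipping and Gaussian noise while adding a demographic-parity penalty to the loss. It is not the dissertation's PFLR/PFLR* or DPSGD-F; the penalty, hyperparameters, and full-batch update are illustrative assumptions, and because the fairness term couples examples through the group means, this simple version does not give a rigorous per-example sensitivity bound.

```python
# Hedged sketch: logistic regression trained with DP-SGD-style clipping and
# Gaussian noise plus a demographic-parity penalty (illustrative only).
import numpy as np

def dp_fair_logreg(X, y, s, epochs=20, lr=0.1, clip=1.0,
                   noise_mult=1.0, fair_lam=1.0, rng=np.random.default_rng(0)):
    """X: (n, d) features; y: {0,1} labels; s: {0,1} sensitive attribute.
    Assumes both groups are non-empty."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-X @ w))             # predicted probabilities
        grads = (p - y)[:, None] * X                 # per-example logistic-loss grads
        # Demographic-parity penalty: fair_lam * (mean score gap between groups)^2.
        gap = p[s == 1].mean() - p[s == 0].mean()
        dgap = (s == 1) / (s == 1).sum() - (s == 0) / (s == 0).sum()
        grads += 2 * fair_lam * gap * (p * (1 - p) * dgap)[:, None] * X
        # DP-SGD step: clip each per-example gradient, sum, add Gaussian noise.
        norms = np.linalg.norm(grads, axis=1, keepdims=True)
        grads *= np.minimum(1.0, clip / np.maximum(norms, 1e-12))
        noisy = grads.sum(axis=0) + rng.normal(0.0, noise_mult * clip, size=d)
        w -= lr * noisy / n
    return w
```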

    FairDP: Certified Fairness with Differential Privacy

    This paper introduces FairDP, a novel mechanism designed to achieve certified fairness with differential privacy (DP). FairDP independently trains models for distinct individual groups, using group-specific clipping terms to assess and bound the disparate impact of DP. Throughout the training process, the mechanism progressively integrates knowledge from the group models to formulate a comprehensive model that balances privacy, utility, and fairness in downstream tasks. Extensive theoretical and empirical analyses validate the efficacy of FairDP and its improved trade-offs between model utility, privacy, and fairness compared with existing methods.
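    FairDP's use of group-specific clipping terms can be pictured with a simple DP-SGD-style update in which each group's per-example gradients are clipped with their own bound. The sketch below is only that picture: the data layout, noise calibration, and function names are assumptions, not the paper's mechanism (which trains separate group models and integrates their knowledge).

```python
# Hedged sketch: one DP-SGD-style update with group-specific clipping bounds.
import numpy as np

def dpsgd_step_groupwise(w, per_example_grads, groups, clip_by_group,
                         noise_mult, lr, rng):
    """per_example_grads: (n, d); groups: (n,) group ids covered by clip_by_group;
    clip_by_group: dict mapping group id -> clipping norm for that group."""
    n, d = per_example_grads.shape
    clipped = np.zeros_like(per_example_grads)
    for g, c in clip_by_group.items():
        idx = groups == g
        norms = np.linalg.norm(per_example_grads[idx], axis=1, keepdims=True)
        clipped[idx] = per_example_grads[idx] * np.minimum(1.0, c / np.maximum(norms, 1e-12))
    # One simple choice: calibrate the noise to the largest per-group bound.
    max_clip = max(clip_by_group.values())
    noisy_sum = clipped.sum(axis=0) + rng.normal(0.0, noise_mult * max_clip, size=d)
    return w - lr * noisy_sum / n
```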

    Stochastic Differentially Private and Fair Learning

    Machine learning models are increasingly used in high-stakes decision-making systems. In such applications, a major concern is that these models sometimes discriminate against certain demographic groups, such as individuals of a particular race, gender, or age. Another major concern in these applications is the violation of users' privacy. While fair learning algorithms have been developed to mitigate discrimination, these algorithms can still leak sensitive information, such as individuals' health or financial records. Utilizing the notion of differential privacy (DP), prior works aimed to develop learning algorithms that are both private and fair. However, existing algorithms for DP fair learning are either not guaranteed to converge or require a full batch of data in each iteration to converge. In this paper, we provide the first stochastic differentially private algorithm for fair learning that is guaranteed to converge. Here, the term "stochastic" refers to the fact that our proposed algorithm converges even when minibatches of data are used at each iteration (i.e., stochastic optimization). Our framework is flexible enough to permit different fairness notions, including demographic parity and equalized odds. In addition, our algorithm can be applied to non-binary classification tasks with multiple (non-binary) sensitive attributes. As a byproduct of our convergence analysis, we provide the first utility guarantee for a DP algorithm for solving nonconvex-strongly-concave min-max problems. Our numerical experiments show that the proposed algorithm consistently offers significant performance gains over state-of-the-art baselines and can be applied to larger-scale problems with non-binary target/sensitive attributes. Comment: ICLR 202
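    The algorithmic core described above can be pictured as noisy stochastic gradient descent-ascent on a min-max objective L(w, v), descending in the model parameters w and ascending in a dual/adversarial variable v that encodes the fairness constraint. The code below is such a generic step under assumed interfaces (grad_w, grad_v, clipping bounds, step sizes); it is not the paper's algorithm or its convergence-preserving noise calibration.

```python
# Hedged sketch: one noisy stochastic gradient descent-ascent (SGDA) step for a
# min-max fair-learning objective, with per-example clipping and Gaussian noise.
import numpy as np

def dp_sgda_step(w, v, batch, grad_w, grad_v, clip_w, clip_v,
                 noise_mult, lr_w, lr_v, rng):
    """grad_w(w, v, x) and grad_v(w, v, x) return per-example gradients."""
    gw, gv = np.zeros_like(w), np.zeros_like(v)
    for x in batch:                          # accumulate clipped per-example grads
        g1, g2 = grad_w(w, v, x), grad_v(w, v, x)
        gw += g1 * min(1.0, clip_w / (np.linalg.norm(g1) + 1e-12))
        gv += g2 * min(1.0, clip_v / (np.linalg.norm(g2) + 1e-12))
    b = len(batch)
    gw = (gw + rng.normal(0.0, noise_mult * clip_w, size=w.shape)) / b
    gv = (gv + rng.normal(0.0, noise_mult * clip_v, size=v.shape)) / b
    return w - lr_w * gw, v + lr_v * gv      # descend in w, ascend in v
```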

    Individual Privacy Accounting for Differentially Private Stochastic Gradient Descent

    Differentially private stochastic gradient descent (DP-SGD) is the workhorse algorithm for recent advances in private deep learning. It provides a single privacy guarantee for all datapoints in the dataset. We propose output-specific (ε, δ)-DP to characterize privacy guarantees for individual examples when releasing models trained by DP-SGD. We also design an efficient algorithm to investigate individual privacy across a number of datasets. We find that most examples enjoy stronger privacy guarantees than the worst-case bound. We further discover that the training loss and the privacy parameter of an example are well correlated. This implies that groups that are underserved in terms of model utility simultaneously experience weaker privacy guarantees. For example, on CIFAR-10, the average ε of the class with the lowest test accuracy is 44.2% higher than that of the class with the highest accuracy. Comment: Published in Transactions on Machine Learning Research (TMLR).
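    The intuition behind output-specific guarantees can be illustrated on a single DP-SGD release: an example whose gradient norm is below the clipping threshold C contributes less than the worst-case sensitivity, so its effective noise multiplier is larger and its ε smaller. The toy sketch below uses the classical single-shot Gaussian-mechanism bound ε = sqrt(2 ln(1.25/δ))/σ (valid only for ε ≤ 1) and does not compose over training steps, so it is not the paper's accounting algorithm.

```python
# Toy illustration: per-example epsilon for one Gaussian-mechanism release,
# based on each example's effective (post-clipping) sensitivity.
import numpy as np

def per_example_epsilon(grad_norms, clip, noise_mult, delta=1e-5):
    """grad_norms: per-example gradient norms; clip: threshold C;
    noise_mult: sigma, so the added noise has std sigma * C."""
    eff_sensitivity = np.minimum(grad_norms, clip)                 # after clipping
    eff_sigma = (noise_mult * clip) / np.maximum(eff_sensitivity, 1e-12)
    return np.sqrt(2.0 * np.log(1.25 / delta)) / eff_sigma

# Examples with smaller gradients get a (much) smaller epsilon.
eps = per_example_epsilon(np.array([0.1, 0.5, 1.0, 3.0]), clip=1.0, noise_mult=8.0)
```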

    Directional Privacy for Deep Learning

    Differentially Private Stochastic Gradient Descent (DP-SGD) is a key method for applying privacy in the training of deep learning models. DP-SGD applies isotropic Gaussian noise to gradients during training, which can perturb these gradients in any direction, damaging utility. Metric DP, however, can provide alternative mechanisms based on arbitrary metrics that might be more suitable. In this paper we apply directional privacy, via a mechanism based on the von Mises-Fisher (VMF) distribution, to perturb gradients in terms of angular distance so that the gradient direction is broadly preserved. We show that this provides εd-privacy for deep learning training, rather than the (ε, δ)-privacy of the Gaussian mechanism, and that, experimentally on key datasets, the VMF mechanism can outperform the Gaussian in the utility-privacy trade-off.
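    A minimal way to picture the VMF mechanism is to resample the gradient's direction from a von Mises-Fisher distribution centered on the true direction while keeping its magnitude. The sketch below does exactly that and assumes SciPy >= 1.11 (which provides scipy.stats.vonmises_fisher); treating kappa directly as the privacy parameter and preserving the norm are simplifying assumptions, not the paper's calibration.

```python
# Hedged sketch: perturb a gradient's direction with a von Mises-Fisher sample
# centered on its true direction (requires SciPy >= 1.11).
import numpy as np
from scipy.stats import vonmises_fisher

def vmf_perturb_gradient(grad, kappa, rng=None):
    norm = np.linalg.norm(grad)
    if norm == 0.0:
        return grad
    mu = grad / norm                                   # mean direction on the unit sphere
    sample = vonmises_fisher(mu, kappa).rvs(size=1, random_state=rng)
    direction = np.asarray(sample).reshape(-1)         # handle (1, d) or (d,) output
    return norm * direction                            # same magnitude, noisy direction

noisy = vmf_perturb_gradient(np.array([1.0, 2.0, 0.5]), kappa=50.0)
```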

    The Interplay Between Privacy and Fairness in Learning and Decision Making Problems

    The availability of large datasets and computational resources has driven significant progress in Artificial Intelligence (AI) and, especially, Machine Learning (ML). These advances have rendered AI systems instrumental for many decision-making and policy operations involving individuals: they include assistance in legal decisions, lending, and hiring, as well as determinations of resources and benefits, all of which have profound social and economic impacts. While data-driven systems have been successful in an increasing number of tasks, the use of rich datasets, combined with the adoption of black-box algorithms, has sparked concerns about how these systems operate. How much information these systems leak about the individuals whose data is used as input, and how they handle biases and fairness issues, are two of these critical concerns. While some argue that privacy and fairness are aligned, most instead consider them contrasting objectives. This thesis first studies the interaction between privacy and fairness in machine learning and decision problems. It focuses on the scenario in which fairness and privacy are at odds and investigates the factors that can explain this behavior. It then proposes effective and efficient mitigation solutions to improve fairness under privacy constraints. In the second part, it analyzes the connection between fairness and other machine learning concepts, such as model compression and adversarial robustness. Finally, it introduces a novel privacy concept and an initial implementation to protect users' privacy at inference time.