Socially Responsible Machine Learning: On the Preservation of Individual Privacy and Fairness

Abstract

Machine learning (ML) techniques have seen significant advances over the last decade and play an increasingly critical role in people's lives. While their potential societal benefits are enormous, they can also inflict great harm if not developed or used with care. In this thesis, we focus on two critical ethical issues in ML systems, the violation of privacy and fairness, and explore mitigating approaches in various scenarios.

On the privacy front, when ML systems are developed with private data from individuals, it is critical to prevent privacy violations. Differential privacy (DP), a widely used notion of privacy, ensures that no one, by observing the computational outcome, can infer any particular individual's data with high confidence. However, DP is typically achieved by randomizing algorithms (e.g., adding noise), which inevitably leads to a trade-off between individual privacy and outcome accuracy. This trade-off can be difficult to balance, especially in settings where the same or correlated data is repeatedly used or exposed during the computation. In the first part of the thesis, we illustrate two key ideas for balancing an algorithm's privacy-accuracy trade-off: (1) reusing intermediate computational results to reduce information leakage, and (2) improving algorithmic robustness to accommodate more randomness. We introduce a number of randomized, privacy-preserving algorithms that leverage these ideas in contexts such as distributed optimization and sequential computation, and we show that they can significantly improve the privacy-accuracy trade-off over existing solutions.

On the fairness front, ML systems trained on real-world data can inherit biases and exhibit discrimination against already-disadvantaged or marginalized social groups. Recent works have proposed many fairness notions to measure and remedy such biases, but their effectiveness is mostly studied in a static framework that does not account for the interactions between individuals and ML systems. Since individuals inevitably react to the algorithmic decisions they are subjected to, understanding the downstream impacts of ML decisions is critical to ensuring that these decisions are socially responsible. In the second part of the thesis, we present our research on evaluating the long-term impacts of (fair) ML decisions. Specifically, we establish a number of theoretically rigorous frameworks that model the interactions and feedback between ML systems and individuals, and we conduct equilibrium analysis to evaluate the impact each has on the other. We illustrate how ML decisions and individual behavior evolve in such a system, and how imposing common fairness criteria, though intended to promote fairness, may nevertheless lead to pernicious effects. Aided by this understanding, we also discuss mitigation approaches.
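For concreteness, the DP guarantee described above is the standard $\varepsilon$-differential-privacy definition (textbook notation, not notation specific to this thesis): a randomized algorithm $\mathcal{M}$ satisfies $\varepsilon$-DP if

    \[ \Pr[\mathcal{M}(D) \in S] \;\le\; e^{\varepsilon} \, \Pr[\mathcal{M}(D') \in S] \]

for every measurable outcome set $S$ and every pair of neighboring datasets $D, D'$ differing in a single individual's record. A smaller $\varepsilon$ makes the two output distributions harder to distinguish, i.e., gives stronger privacy.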
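As a minimal sketch of the noise-adding mechanisms the abstract refers to, the snippet below implements the classical Laplace mechanism (not one of the thesis's algorithms; all names and parameter values are illustrative). It makes the privacy-accuracy trade-off visible: the noise scale is sensitivity/epsilon, so a smaller epsilon (stronger privacy) forces a less accurate release.

    import numpy as np

    def laplace_mechanism(true_value, sensitivity, epsilon, rng=None):
        """Release true_value with Laplace noise calibrated for epsilon-DP.

        Noise scale is sensitivity / epsilon: smaller epsilon (stronger
        privacy) means larger noise, i.e., lower accuracy.
        """
        if rng is None:
            rng = np.random.default_rng()
        scale = sensitivity / epsilon
        return true_value + rng.laplace(loc=0.0, scale=scale)

    # Example: privately release the mean of n values in [0, 1].
    # Changing one individual's record moves the mean by at most 1/n.
    rng = np.random.default_rng(0)
    data = rng.uniform(size=1000)
    for eps in (0.1, 1.0, 10.0):
        noisy = laplace_mechanism(data.mean(), sensitivity=1 / len(data),
                                  epsilon=eps, rng=rng)
        print(f"epsilon={eps:>4}: noisy mean = {noisy:.4f} "
              f"(true = {data.mean():.4f})")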
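The feedback loops studied in the second part can be pictured with a toy simulation. The dynamics below are a hypothetical stylization, not the thesis's model: a perfectly accurate classifier accepts the qualified fraction of a group, acceptance and rejection change who is qualified in the next round, and the group's qualification rate settles at an equilibrium that can be analyzed in closed form.

    def step(q, accept_boost=0.8, reject_boost=0.3):
        """One round of stylized (hypothetical) population response:
        the qualified fraction q is accepted; accepted individuals stay
        qualified next round with prob accept_boost, rejected individuals
        become qualified with prob reject_boost."""
        return q * accept_boost + (1 - q) * reject_boost

    q = 0.2  # initial qualification rate of a disadvantaged group
    for t in range(30):
        q = step(q)
    print(f"equilibrium qualification rate ~ {q:.3f}")
    # Fixed point: reject_boost / (1 - accept_boost + reject_boost) = 0.6

Decisions made today reshape tomorrow's population, which is why a fairness constraint evaluated only statically can behave very differently once such dynamics are accounted for.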

Ph.D., Electrical and Computer Engineering
University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/169960/1/xueru_1.pd