464 research outputs found
Generalization in Deep Learning
This paper provides theoretical insights into why and how deep learning can
generalize well, despite its large capacity, complexity, possible algorithmic
instability, nonrobustness, and sharp minima, responding to an open question in
the literature. We also discuss approaches to provide non-vacuous
generalization guarantees for deep learning. Based on theoretical observations,
we propose new open problems and discuss the limitations of our results.Comment: To appear in Mathematics of Deep Learning, Cambridge University
Press. All previous results remain unchange
Optimal and Fair Encouragement Policy Evaluation and Learning
In consequential domains, it is often impossible to compel individuals to
take treatment, so that optimal policy rules are merely suggestions in the
presence of human non-adherence to treatment recommendations. In these same
domains, there may be heterogeneity both in who responds in taking-up
treatment, and heterogeneity in treatment efficacy. While optimal treatment
rules can maximize causal outcomes across the population, access parity
constraints or other fairness considerations can be relevant in the case of
encouragement. For example, in social services, a persistent puzzle is the gap
in take-up of beneficial services among those who may benefit from them the
most. When in addition the decision-maker has distributional preferences over
both access and average outcomes, the optimal decision rule changes. We study
causal identification, statistical variance-reduced estimation, and robust
estimation of optimal treatment rules, including under potential violations of
positivity. We consider fairness constraints such as demographic parity in
treatment take-up, and other constraints, via constrained optimization. Our
framework can be extended to handle algorithmic recommendations under an
often-reasonable covariate-conditional exclusion restriction, using our
robustness checks for lack of positivity in the recommendation. We develop a
two-stage algorithm for solving over parametrized policy classes under general
constraints to obtain variance-sensitive regret bounds. We illustrate the
methods in two case studies based on data from randomized encouragement to
enroll in insurance and from pretrial supervised release with electronic
monitoring
The Paradox of Noise: An Empirical Study of Noise-Infusion Mechanisms to Improve Generalization, Stability, and Privacy in Federated Learning
In a data-centric era, concerns regarding privacy and ethical data handling
grow as machine learning relies more on personal information. This empirical
study investigates the privacy, generalization, and stability of deep learning
models in the presence of additive noise in federated learning frameworks. Our
main objective is to provide strategies to measure the generalization,
stability, and privacy-preserving capabilities of these models and further
improve them. To this end, five noise infusion mechanisms at varying noise
levels within centralized and federated learning settings are explored. As
model complexity is a key component of the generalization and stability of deep
learning models during training and evaluation, a comparative analysis of three
Convolutional Neural Network (CNN) architectures is provided. The paper
introduces Signal-to-Noise Ratio (SNR) as a quantitative measure of the
trade-off between privacy and training accuracy of noise-infused models, aiming
to find the noise level that yields optimal privacy and accuracy. Moreover, the
Price of Stability and Price of Anarchy are defined in the context of
privacy-preserving deep learning, contributing to the systematic investigation
of the noise infusion strategies to enhance privacy without compromising
performance. Our research sheds light on the delicate balance between these
critical factors, fostering a deeper understanding of the implications of
noise-based regularization in machine learning. By leveraging noise as a tool
for regularization and privacy enhancement, we aim to contribute to the
development of robust, privacy-aware algorithms, ensuring that AI-driven
solutions prioritize both utility and privacy
Generalization in Deep Learning
With a direct analysis of neural networks, this paper presents a mathematically tight generalization theory to partially address an open problem regarding the generalization of deep learning. Unlike previous bound-based theory, our main theory is quantitatively as tight as possible for every dataset individually, while producing qualitative insights competitively. Our results give insight into why and how deep learning can generalize well, despite its large capacity, complexity, possible algorithmic instability, nonrobustness, and sharp minima, answering to an open question in the literature. We also discuss limitations of our results and propose additional open problems
- …