Fine-Grained Analysis of Stability and Generalization for Stochastic Gradient Descent
Recently, a considerable amount of work has been devoted to the study of
algorithmic stability and generalization for stochastic gradient descent (SGD).
However, existing stability analyses require imposing restrictive
assumptions on the boundedness of gradients, strong smoothness and convexity of
loss functions. In this paper, we provide a fine-grained analysis of stability
and generalization for SGD by substantially relaxing these assumptions.
Firstly, we establish stability and generalization for SGD by removing the
existing bounded gradient assumptions. The key idea is the introduction of a
new stability measure called on-average model stability, for which we develop
novel bounds controlled by the risks of SGD iterates. This yields
generalization bounds depending on the behavior of the best model, and leads to
the first known fast bounds in the low-noise setting using a stability
approach. Secondly, the smoothness assumption is relaxed by considering loss
functions with Hölder continuous (sub)gradients, for which we show that optimal
bounds are still achieved by balancing computation and stability. To the best of
our knowledge, this gives the first known stability and generalization bounds
for SGD with even non-differentiable loss functions. Finally, we study learning
problems with (strongly) convex objectives but non-convex loss functions.
Comment: to appear in ICML 202
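As a rough illustration of the on-average model stability idea mentioned in the abstract above, the following sketch compares the SGD output on a sample S with the outputs on samples S^(i) in which the i-th example is replaced, and averages the distances. Everything here is an assumption made for illustration: the scalar least-squares model, the toy data, the step size, and the function names are hypothetical, and the sketch computes a single empirical draw rather than the expectation used in the paper.

```python
import random

def sgd_least_squares(data, steps, lr, seed):
    """SGD on the loss 0.5 * (w * x - y) ** 2 with a scalar weight w."""
    rng = random.Random(seed)  # fixed seed: every run visits the same indices
    w = 0.0
    for _ in range(steps):
        x, y = data[rng.randrange(len(data))]
        w -= lr * (w * x - y) * x  # gradient of the squared loss in w
    return w

def on_average_model_stability(data, replacements, steps=200, lr=0.05, seed=0):
    """Mean of |A(S) - A(S^(i))| over i, where S^(i) swaps in replacements[i]."""
    w_full = sgd_least_squares(data, steps, lr, seed)
    total = 0.0
    for i, new_example in enumerate(replacements):
        perturbed = list(data)
        perturbed[i] = new_example  # replace only the i-th example
        total += abs(w_full - sgd_least_squares(perturbed, steps, lr, seed))
    return total / len(data)

def draw_sample(n, rng):
    """Toy data: y = 2x plus small Gaussian noise (an arbitrary choice)."""
    sample = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        sample.append((x, 2.0 * x + rng.gauss(0.0, 0.1)))
    return sample

rng = random.Random(1)
S = draw_sample(50, rng)        # training sample
S_tilde = draw_sample(50, rng)  # independent sample supplying replacements
print(on_average_model_stability(S, S_tilde))
```

Sharing the seed between the two runs keeps the sampled index sequence identical, so the measured gap comes only from the replaced example.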
Stability and Generalization Analysis of Gradient Methods for Shallow Neural Networks
While significant theoretical progress has been achieved, the generalization
mystery of overparameterized neural networks remains largely
unresolved. In this paper, we study the generalization behavior of shallow
neural networks (SNNs) by leveraging the concept of algorithmic stability. We
consider gradient descent (GD) and stochastic gradient descent (SGD) to train
SNNs, for both of which we develop consistent excess risk bounds by balancing
the optimization and generalization via early stopping. Compared with existing
analyses of GD, our new analysis requires a relaxed overparameterization
assumption and also applies to SGD. The key to the improvement is a better
estimate of the smallest eigenvalues of the Hessian matrices of the empirical
risks and the loss function along the trajectories of GD and SGD, obtained by
providing a refined estimate of their iterates.
Comment: to appear in Neural Information Processing Systems (NeurIPS 2022)
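The balance between optimization and generalization via early stopping described above can be sketched as follows. This is a minimal illustration, not the paper's procedure: the abstract does not state a stopping rule, so the sketch substitutes the common practice of keeping the iterate with the lowest validation risk. The network form (tanh units, fixed output weights of magnitude 1/sqrt(width)), the toy data, and all names and hyperparameters are assumptions.

```python
import math
import random

def predict(params, x):
    """Shallow network: sum over hidden units of a * tanh(w * x + b)."""
    return sum(a * math.tanh(w * x + b) for w, b, a in params)

def risk(params, data):
    """Empirical squared-loss risk on a list of (x, y) pairs."""
    return sum((predict(params, x) - y) ** 2 for x, y in data) / (2 * len(data))

def gd_step(params, data, lr):
    """One full-batch gradient step on hidden weights and biases only."""
    n = len(data)
    grads = [[0.0, 0.0] for _ in params]
    for x, y in data:
        err = predict(params, x) - y
        for k, (w, b, a) in enumerate(params):
            h = math.tanh(w * x + b)
            common = err * a * (1.0 - h * h) / n  # chain rule through tanh
            grads[k][0] += common * x
            grads[k][1] += common
    return [(w - lr * gw, b - lr * gb, a)
            for (w, b, a), (gw, gb) in zip(params, grads)]

def train_with_early_stopping(train, val, width=16, steps=300, lr=0.5, seed=0):
    """Run GD and return the iterate with the lowest validation risk."""
    rng = random.Random(seed)
    params = [(rng.gauss(0, 1), rng.gauss(0, 1),
               rng.choice((-1.0, 1.0)) / math.sqrt(width))  # fixed output weights
              for _ in range(width)]
    best_params, best_val, best_step = params, risk(params, val), 0
    for t in range(1, steps + 1):
        params = gd_step(params, train, lr)
        v = risk(params, val)
        if v < best_val:
            best_params, best_val, best_step = params, v, t
    return best_params, best_val, best_step

def make_data(n, rng):
    """Toy regression data: y = sin(2x) plus Gaussian noise (arbitrary)."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, math.sin(2.0 * x) + rng.gauss(0.0, 0.2)) for x in xs]

rng = random.Random(0)
train, val = make_data(30, rng), make_data(30, rng)
_, best_val, best_step = train_with_early_stopping(train, val)
print(best_step, best_val)
```

The returned step index shows where the validation risk bottomed out; a theory-driven stopping time would replace the validation check with a fixed iteration count.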
ADAPTIVE TRANSMISSION POWER IN LOW-POWER AND LOSSY NETWORK
Techniques are provided herein for intelligent transmission power control under different transmission patterns in a connected grid mesh. The transmission patterns include asynchronized transmission, broadcast transmission, and unicast transmission. The techniques also provide a mechanism that helps data packets compete against interference on specific channels and gives high-priority Quality of Service (QoS) packets a greater chance of being received when congestion occurs. This enables the connected grid mesh to achieve more reliable communication with efficient power consumption.
Emergent Communication in Interactive Sketch Question Answering
Vision-based emergent communication (EC) aims to learn to communicate through
sketches and demystify the evolution of human communication. Ironically,
previous works neglect multi-round interaction, which is indispensable in human
communication. To fill this gap, we first introduce a novel Interactive Sketch
Question Answering (ISQA) task, where two collaborative players interact
through sketches to answer a question about an image in a multi-round manner.
To accomplish this task, we design a new and efficient interactive EC system,
which can achieve an effective balance among three evaluation factors,
including the question answering accuracy, drawing complexity and human
interpretability. Our experimental results, including human evaluation,
demonstrate that the multi-round interactive mechanism facilitates targeted and
efficient communication between intelligent agents with decent human
interpretability.
Comment: Accepted by NeurIPS 202