848 research outputs found

Stochastic proximal AUC maximization

Authors: Lei Yunwen, Ying Yiming
Publication date: 28/02/2021

University of Birmingham Research Portal

Fine-Grained Analysis of Stability and Generalization for Stochastic Gradient Descent

Authors: Lei Yunwen, Ying Yiming
Publication date: 01/01/2020

Recently there has been a considerable amount of work devoted to the study of the algorithmic stability and generalization for stochastic gradient descent (SGD). However, the existing stability analysis requires imposing restrictive assumptions on the boundedness of gradients, strong smoothness and convexity of loss functions. In this paper, we provide a fine-grained analysis of stability and generalization for SGD by substantially relaxing these assumptions. Firstly, we establish stability and generalization for SGD by removing the existing bounded gradient assumptions. The key idea is the introduction of a new stability measure called on-average model stability, for which we develop novel bounds controlled by the risks of SGD iterates. This yields generalization bounds depending on the behavior of the best model, and leads to the first-ever-known fast bounds in the low-noise setting using a stability approach. Secondly, the smoothness assumption is relaxed by considering loss functions with Hölder continuous (sub)gradients, for which we show that optimal bounds are still achieved by balancing computation and stability. To the best of our knowledge, this gives the first-ever-known stability and generalization bounds for SGD with even non-differentiable loss functions. Finally, we study learning problems with (strongly) convex objectives but non-convex loss functions. Comment: to appear in ICML 202

arXiv.org e-Print Archive

University of Birmingham Research Portal

Stability and Generalization Analysis of Gradient Methods for Shallow Neural Networks

Authors: Jin Rong, Lei Yunwen, Ying Yiming
Publication date: 19/09/2022

While significant theoretical progress has been achieved, unveiling the generalization mystery of overparameterized neural networks remains largely elusive. In this paper, we study the generalization behavior of shallow neural networks (SNNs) by leveraging the concept of algorithmic stability. We consider gradient descent (GD) and stochastic gradient descent (SGD) to train SNNs, for both of which we develop consistent excess risk bounds by balancing optimization and generalization via early stopping. Compared to existing analyses of GD, our new analysis requires a relaxed overparameterization assumption and also applies to SGD. The key to the improvement is a better estimation of the smallest eigenvalues of the Hessian matrices of the empirical risks and the loss function along the trajectories of GD and SGD, obtained by providing a refined estimation of their iterates. Comment: to appear in Neural Information Processing Systems (NeurIPS 2022)

arXiv.org e-Print Archive

ADAPTIVE TRANSMISSION POWER IN LOW-POWER AND LOSSY NETWORK

Authors: Chen Yiming, Fang Xiang, Xie Mingyu, Zhao Lei
Publication venue: Technical Disclosure Commons
Publication date: 25/09/2018

Techniques are provided herein for intelligent transmission power control under different transmission patterns in a connected grid mesh. The transmission patterns include asynchronous transmission, broadcast transmission, and unicast transmission. The techniques also provide a mechanism that helps data packets compete against interference on specific channels and gives high-priority Quality of Service (QoS) packets a greater chance of being received when congestion occurs. This enables the connected grid mesh to achieve higher communication reliability with efficient power consumption.

Technical Disclosure Commons

Emergent Communication in Interactive Sketch Question Answering

Authors: Chen Siheng, Lei Zixing, Xiong Yuxin, Zhang Yiming
Publication date: 24/10/2023

Vision-based emergent communication (EC) aims to learn to communicate through sketches and demystify the evolution of human communication. Ironically, previous works neglect multi-round interaction, which is indispensable in human communication. To fill this gap, we first introduce a novel Interactive Sketch Question Answering (ISQA) task, where two collaborative players interact through sketches to answer a question about an image in a multi-round manner. To accomplish this task, we design a new and efficient interactive EC system, which can achieve an effective balance among three evaluation factors: question answering accuracy, drawing complexity and human interpretability. Our experimental results, including human evaluation, demonstrate that a multi-round interactive mechanism facilitates targeted and efficient communication between intelligent agents with decent human interpretability. Comment: Accepted by NeurIPS 202

arXiv.org e-Print Archive

A Reinforced Improved Attention Model for Abstractive Text Summarization

Authors: Chang Yu, Huang Yiming, Lei Hang, Li Xiaoyu
Publication venue: Waseda Institute for the Study of Language and Information
Publication date: 01/01/2019

Waseda University Repository

Stability and differential privacy of stochastic gradient descent for pairwise learning with non-smooth loss

Authors: Lei Yunwen, Lyu Siwei, Yang Zhenhuan, Ying Yiming
Publication date: 15/04/2021

University of Birmingham Research Portal