Statistical inference using SGD
We present a novel method for frequentist statistical inference in
M-estimation problems, based on stochastic gradient descent (SGD) with a
fixed step size: we demonstrate that the average of such SGD sequences can be
used for statistical inference, after proper scaling. An intuitive analysis
using the Ornstein-Uhlenbeck process suggests that such averages are
asymptotically normal. From a practical perspective, our SGD-based inference
procedure is a first order method, and is well-suited for large scale problems.
To show its merits, we apply it to both synthetic and real datasets, and
demonstrate that its accuracy is comparable to classical statistical methods,
while requiring potentially far less computation.
Comment: To appear in AAAI 2018
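To make the averaging idea concrete, here is a minimal, self-contained sketch on least-squares linear regression: fixed-step SGD, a running average of the iterates, and approximate 95% confidence intervals from a plug-in sandwich covariance. The step size, iteration count, and the plug-in covariance are illustrative assumptions, not the paper's exact scaling or estimator.

```python
# Minimal sketch: fixed-step SGD averaging for inference on least squares.
# Step size, iteration count, and the plug-in covariance are assumptions.
import numpy as np

rng = np.random.default_rng(0)
n, d = 10_000, 5
X = rng.normal(size=(n, d))
theta_star = np.arange(1.0, d + 1)
y = X @ theta_star + rng.normal(size=n)

eta, T = 0.05, 50_000                        # fixed step size, iteration count
theta, avg = np.zeros(d), np.zeros(d)
for t in range(T):
    i = rng.integers(n)                      # one uniformly sampled observation
    grad = (X[i] @ theta - y[i]) * X[i]      # stochastic gradient of squared loss
    theta -= eta * grad
    avg += (theta - avg) / (t + 1)           # running average of the SGD path

# The averaged iterate is approximately normal around the true parameter;
# a plug-in sandwich covariance yields approximate 95% confidence intervals.
H = X.T @ X / n                              # empirical Hessian
r = y - X @ avg                              # residuals at the averaged iterate
S = (X * r[:, None]).T @ (X * r[:, None]) / n
cov = np.linalg.inv(H) @ S @ np.linalg.inv(H) / n
se = np.sqrt(np.diag(cov))
print("estimate:", np.round(avg, 3))
print("95% CI half-widths:", np.round(1.96 * se, 3))
```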
Online Bootstrap Inference with Nonconvex Stochastic Gradient Descent Estimator
In this paper, we investigate the theoretical properties of stochastic
gradient descent (SGD) for statistical inference in the context of nonconvex
optimization problems, which have been relatively unexplored compared to convex
settings. Our study is the first to establish provable inferential procedures
using the SGD estimator for general nonconvex objective functions, which may
contain multiple local minima.
We propose two novel online inferential procedures that combine SGD and the
multiplier bootstrap technique. The first procedure employs a consistent
covariance matrix estimator, and we establish its error convergence rate. The
second procedure approximates the limit distribution using bootstrap SGD
estimators, yielding asymptotically valid bootstrap confidence intervals. We
validate the effectiveness of both approaches through numerical experiments.
Furthermore, our analysis yields an intermediate result: the in-expectation
error convergence rate for the original SGD estimator in nonconvex settings,
which is comparable to existing results for convex problems. We believe this
novel finding holds independent interest and enriches the literature on
optimization and statistical inference.
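To illustrate the multiplier-bootstrap construction, the sketch below reruns averaged SGD with random mean-one multiplier weights on the per-sample gradients and reads confidence intervals off the bootstrap quantiles. It uses a convex logistic-regression toy for brevity (the paper's setting is nonconvex), and the exponential(1) weights, decaying step size, and all constants are illustrative assumptions rather than the paper's exact procedures.

```python
# Minimal multiplier-bootstrap SGD sketch on a logistic-regression toy.
# Weight distribution, step-size schedule, and constants are assumptions.
import numpy as np

rng = np.random.default_rng(1)
n, d, B = 5_000, 3, 50
X = rng.normal(size=(n, d))
theta_star = np.array([1.0, -0.5, 0.25])
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-X @ theta_star))).astype(float)

def averaged_sgd(w, T=10_000, eta=0.5):
    """Averaged SGD on the logistic loss, gradients scaled by weights w."""
    theta, avg = np.zeros(d), np.zeros(d)
    for t in range(T):
        i = rng.integers(n)
        p = 1.0 / (1.0 + np.exp(-X[i] @ theta))
        theta -= eta / np.sqrt(t + 1) * w[i] * (p - y[i]) * X[i]
        avg += (theta - avg) / (t + 1)
    return avg

point = averaged_sgd(np.ones(n))                        # unperturbed estimator
boot = np.stack([averaged_sgd(rng.exponential(size=n))  # mean-1 multipliers
                 for _ in range(B)])
lo, hi = np.quantile(boot, [0.025, 0.975], axis=0)
print("point estimate  :", np.round(point, 3))
print("bootstrap 95% CI:", np.round(lo, 3), "to", np.round(hi, 3))
```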
Statistical Inference with Stochastic Gradient Methods under $\phi$-mixing Data
Stochastic gradient descent (SGD) is a scalable and memory-efficient
optimization algorithm for large datasets and streaming data, and it has
attracted a great deal of attention. Applications of SGD-based estimators to
statistical inference, such as interval estimation, have also achieved great
success. However, most related work assumes i.i.d. observations or Markov
chains; when the observations come from a mixing time series, how to conduct
valid statistical inference remains unexplored. Indeed, the correlation among
observations poses a challenge for interval estimation: most existing methods
ignore this correlation and can produce invalid confidence intervals. In this
paper, we propose a mini-batch SGD estimator for statistical inference when
the data is $\phi$-mixing. The
confidence intervals are constructed using an associated mini-batch bootstrap
SGD procedure. Using the ``independent block'' trick from \cite{yu1994rates}, we
show that the proposed estimator is asymptotically normal, and its limiting
distribution can be effectively approximated by the bootstrap procedure. The
proposed method is memory-efficient and easy to implement in practice.
Simulation studies on synthetic data and an application to a real-world dataset
confirm our theory.
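As a rough illustration of mini-batching on dependent data, the sketch below estimates a linear model with AR(1) errors (a simple mixing series) using mini-batches that are contiguous stretches of the sequence, so each batch gradient retains the local dependence structure. The data model, block length, and step size are illustrative assumptions, not the paper's exact construction.

```python
# Minimal sketch: mini-batch SGD with contiguous blocks on dependent data.
# AR(1) errors, block length, and step size are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(2)
n, d, block = 20_000, 4, 50
eps = np.zeros(n)
for t in range(1, n):                        # AR(1) noise -> mixing, not i.i.d.
    eps[t] = 0.6 * eps[t - 1] + rng.normal()
X = rng.normal(size=(n, d))
y = X @ np.ones(d) + eps

theta, avg = np.zeros(d), np.zeros(d)
for t in range(2_000):
    s = rng.integers(n - block)              # start of a contiguous block
    Xb, yb = X[s:s + block], y[s:s + block]
    grad = Xb.T @ (Xb @ theta - yb) / block  # mini-batch gradient on the block
    theta -= 0.05 * grad
    avg += (theta - avg) / (t + 1)
print("block mini-batch SGD estimate:", np.round(avg, 3))
```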
Stochastic gradient methods for statistical inference
Statistical inference, such as hypothesis testing and calculating a confidence interval, is an important tool for assessing uncertainty in machine learning and statistical problems. Stochastic gradient methods, such as stochastic gradient descent (SGD), have recently been applied successfully to point estimation in large scale machine learning problems. In this work, we present novel stochastic gradient methods for statistical inference in large scale machine learning problems.

Unregularized M-estimation using SGD. Using SGD with a fixed step size, we demonstrate that the average of such SGD sequences can be used for statistical inference, after proper scaling. An intuitive analysis using the Ornstein-Uhlenbeck process suggests that such averages are asymptotically normal. From a practical perspective, our SGD-based inference procedure is a first order method, and is well-suited for large scale problems. To show its merits, we apply it to both synthetic and real datasets, and demonstrate that its accuracy is comparable to classical statistical methods, while requiring potentially far less computation.

Approximate Newton-based statistical inference using only stochastic gradients for unregularized M-estimation. We present a novel inference framework for convex empirical risk minimization, using approximate stochastic Newton steps. The proposed algorithm is based on the notion of finite differences and allows the approximation of a Hessian-vector product from first-order information alone, as sketched below. In theory, our method efficiently computes the statistical error covariance in M-estimation for unregularized convex learning problems, without using exact second order information or resampling the entire data set. In practice, we demonstrate the effectiveness of our framework on large-scale machine learning problems that go even beyond convexity: as a highlight, our work can be used to detect certain adversarial attacks on neural networks.

High dimensional linear regression statistical inference using only stochastic gradients. As an extension of the approximate Newton-based statistical inference algorithm for unregularized problems, we present a similar algorithm, using only stochastic gradients, for statistical inference in high dimensional linear regression, where the number of features is much larger than the number of samples.

Stochastic gradient methods for time series analysis. We present a novel stochastic gradient descent algorithm for time series analysis, which correctly captures correlation structures in a time series dataset during optimization. Instead of uniformly sampling indices as in vanilla SGD, we uniformly sample contiguous blocks of indices, where the block length depends on the dataset.
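The finite-difference step behind the approximate Newton procedure can be sketched in a few lines: a Hessian-vector product is approximated from two stochastic gradient evaluations, so no second-order information is ever formed. The least-squares test problem and the perturbation size delta below are illustrative assumptions.

```python
# Minimal sketch: Hessian-vector product from first-order information only.
# The least-squares test problem and delta are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(3)
n, d = 2_000, 4
X = rng.normal(size=(n, d))
y = X @ np.ones(d) + rng.normal(size=n)

def stoch_grad(theta, idx):
    """Mini-batch gradient of the least-squares loss."""
    return X[idx].T @ (X[idx] @ theta - y[idx]) / len(idx)

theta = rng.normal(size=d)                   # current iterate
v = rng.normal(size=d)                       # direction to multiply by
idx = rng.integers(n, size=256)              # one mini-batch
delta = 1e-4
# Two gradient calls approximate the Hessian-vector product H @ v:
hv = (stoch_grad(theta + delta * v, idx) - stoch_grad(theta, idx)) / delta
exact = X[idx].T @ (X[idx] @ v) / len(idx)   # exact mini-batch Hessian times v
print("finite-difference Hv:", np.round(hv, 4))
print("exact Hv            :", np.round(exact, 4))
```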