Online Bootstrap Inference with Nonconvex Stochastic Gradient Descent Estimator
In this paper, we investigate the theoretical properties of stochastic
gradient descent (SGD) for statistical inference in the context of nonconvex
optimization problems, which have been relatively unexplored compared to convex
settings. Our study is the first to establish provable inferential procedures
using the SGD estimator for general nonconvex objective functions, which may
contain multiple local minima.
We propose two novel online inferential procedures that combine SGD and the
multiplier bootstrap technique. The first procedure employs a consistent
covariance matrix estimator, and we establish its error convergence rate. The
second procedure approximates the limit distribution using bootstrap SGD
estimators, yielding asymptotically valid bootstrap confidence intervals. We
validate the effectiveness of both approaches through numerical experiments.
Furthermore, our analysis yields an intermediate result: the in-expectation
error convergence rate for the original SGD estimator in nonconvex settings,
which is comparable to existing results for convex problems. We believe this
novel finding holds independent interest and enriches the literature on
optimization and statistical inference.
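To make the bootstrap idea concrete, here is a minimal sketch in the spirit of the second procedure on a toy streaming least-squares problem, using Gaussian multiplier weights; the loss, step-size schedule, and weight distribution are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Streaming linear-regression data; theta_star is the target parameter.
d, n, B = 3, 20_000, 200
theta_star = np.array([1.0, -2.0, 0.5])

theta = np.zeros(d)              # original SGD iterate
theta_b = np.zeros((B, d))       # B multiplier-bootstrap SGD iterates
bar = np.zeros(d)                # Polyak-Ruppert averages
bar_b = np.zeros((B, d))

for t in range(1, n + 1):
    x = rng.normal(size=d)
    y = x @ theta_star + rng.normal()
    eta = 0.5 * t ** -0.6                      # decaying step size
    theta -= eta * (x @ theta - y) * x         # squared-loss gradient step
    # Each bootstrap path reuses the same observation, with the gradient
    # perturbed by a random multiplier of mean 1 and variance 1.
    w = rng.normal(1.0, 1.0, size=B)
    theta_b -= eta * (w * (theta_b @ x - y))[:, None] * x
    bar += (theta - bar) / t
    bar_b += (theta_b - bar_b) / t

# 95% percentile-bootstrap confidence interval for the first coordinate.
delta = np.quantile(bar_b[:, 0] - bar[0], [0.025, 0.975])
print(f"theta_1 in [{bar[0] - delta[1]:.3f}, {bar[0] - delta[0]:.3f}]")
```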
Statistical Inference with Stochastic Gradient Methods under $\phi$-mixing Data
Stochastic gradient descent (SGD) is a scalable and memory-efficient
optimization algorithm for large datasets and streaming data that has
attracted a great deal of attention. The applications of SGD-based
estimators to statistical inference such as interval estimation have also
achieved great success. However, most of the related works are based on i.i.d.
observations or Markov chains. When the observations come from a mixing time
series, how to conduct valid statistical inference remains unexplored. Indeed,
the serial correlation among observations makes interval estimation
challenging: methods that ignore this correlation can produce invalid
confidence intervals. In this paper, we propose a mini-batch SGD estimator for
statistical inference when the data are $\phi$-mixing. The
confidence intervals are constructed using an associated mini-batch bootstrap
SGD procedure. Using the ``independent block'' trick from \cite{yu1994rates}, we
show that the proposed estimator is asymptotically normal, and its limiting
distribution can be effectively approximated by the bootstrap procedure. The
proposed method is memory-efficient and easy to implement in practice.
Simulation studies on synthetic data and an application to a real-world dataset
confirm our theory.
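A rough sketch of the mini-batch idea on a toy serially dependent stream: each update uses the averaged gradient of a block of consecutive observations. The AR(1) data, batch size, and step-size schedule are illustrative, and the paper's accompanying bootstrap step is omitted.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy mixing stream: an AR(1) process with mean mu_star.
mu_star, rho, n = 2.0, 0.6, 50_000
eps = rng.normal(size=n)
x = np.empty(n)
x[0] = mu_star + eps[0]
for t in range(1, n):
    x[t] = mu_star + rho * (x[t - 1] - mu_star) + eps[t]

# Mini-batch SGD for the mean: loss 0.5*(mu - x)^2, gradient mu - x.
m = 50                               # mini-batch size
mu, mu_bar = 0.0, 0.0
for k, batch in enumerate(x.reshape(-1, m), start=1):
    eta = 1.0 * k ** -0.6            # decaying step size per batch
    mu -= eta * (mu - batch.mean())  # averaged gradient over the batch
    mu_bar += (mu - mu_bar) / k      # Polyak-Ruppert average

print(f"mini-batch SGD estimate: {mu_bar:.3f} (target {mu_star})")
```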
High Confidence Level Inference is Almost Free using Parallel Stochastic Optimization
Uncertainty quantification for estimates obtained from stochastic optimization
in an online setting has recently gained popularity. This paper
introduces a novel inference method focused on constructing confidence
intervals with efficient computation and fast convergence to the nominal level.
Specifically, we propose to use a small number of independent multi-runs to
acquire distribution information and construct a t-based confidence interval.
Our method requires minimal additional computation and memory beyond the
standard updating of estimates, making the inference process almost cost-free.
We provide a rigorous theoretical guarantee for the confidence interval,
demonstrating that the coverage is approximately exact with an explicit
convergence rate and allowing for high confidence level inference. In
particular, a new Gaussian approximation result is developed for the online
estimators to characterize the coverage properties of our confidence intervals
in terms of relative errors. Additionally, our method can leverage parallel
computing to further accelerate calculations using multiple cores. It is easy
to implement and can be integrated with existing stochastic algorithms without
complicated modifications.
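A minimal sketch of the multi-run idea: K independent averaged-SGD runs give K nearly independent estimates, and a classical t-interval with K-1 degrees of freedom follows. The data-generating model and step-size schedule are illustrative assumptions.

```python
import numpy as np
from scipy.stats import t as t_dist

def sgd_run(n, d, theta_star, seed):
    # One independent averaged-SGD run on a fresh synthetic data stream.
    r = np.random.default_rng(seed)
    theta, theta_bar = np.zeros(d), np.zeros(d)
    for i in range(1, n + 1):
        x = r.normal(size=d)
        y = x @ theta_star + r.normal()
        theta -= 0.5 * i ** -0.6 * (x @ theta - y) * x
        theta_bar += (theta - theta_bar) / i
    return theta_bar

d, K, n = 3, 8, 10_000               # K independent runs (parallelizable)
theta_star = np.array([1.0, -2.0, 0.5])
runs = np.stack([sgd_run(n, d, theta_star, s) for s in range(K)])

# t-based 99% interval for each coordinate from the K run averages.
mean, se = runs.mean(0), runs.std(0, ddof=1) / np.sqrt(K)
q = t_dist.ppf(0.995, df=K - 1)
for j in range(d):
    print(f"theta_{j}: {mean[j]:.3f} +/- {q * se[j]:.3f}")
```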
On the Inversion of High Energy Proton
Inversion of the K-fold stochastic autoconvolution integral equation is an
elementary nonlinear problem, yet there are no de facto methods to solve it
with finite statistics. To address this, we introduce a novel inverse
algorithm that combines relative entropy minimization, the Fast Fourier
Transform, and a recursive version of Efron's bootstrap. This opens new
perspectives on non-perturbative high energy QCD, such as
probing the ab initio principles underlying the approximately negative binomial
distributions of observed charged particle final state multiplicities, related
to multiparton interactions, the fluctuating structure and profile of the
proton, and diffraction. As a proof of concept, we apply the algorithm to ALICE
proton-proton charged particle multiplicity measurements done at different
center-of-mass energies and fiducial pseudorapidity intervals at the LHC,
available on HEPData. A strong double peak structure emerges from the
inversion, barely visible without it.
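For orientation, the forward map being inverted is a K-fold autoconvolution, which the FFT evaluates cheaply as a convolution power. The sketch below shows only this building block, not the paper's relative-entropy inversion or recursive bootstrap.

```python
import numpy as np

def kfold_autoconvolution(p, K):
    """K-fold discrete autoconvolution of a probability vector p via FFT.

    This is the forward map of the integral equation: the distribution
    of a sum of K i.i.d. draws from p.
    """
    size = K * (len(p) - 1) + 1               # support of the K-fold sum
    ph = np.fft.rfft(p, n=size)               # discrete characteristic function
    out = np.fft.irfft(ph ** K, n=size)       # convolution power
    return np.clip(out, 0.0, None)            # clip FFT round-off noise

# Example: 3-fold autoconvolution of a small discrete distribution.
p = np.array([0.2, 0.5, 0.3])
print(kfold_autoconvolution(p, 3))            # sums to 1 up to round-off
```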
A Workflow for Statistical Inference in Stochastic Gradient Descent
Stochastic gradient descent (SGD) is an estimation tool for large datasets,
widely employed in machine learning and statistics. Due to the Markovian nature of the
SGD process, inference is a challenging problem. An underlying asymptotic
normality of the averaged SGD (ASGD) estimator allows for the construction of a
batch-means estimator of the asymptotic covariance matrix. Instead of the usual
increasing batch-size strategy employed in ASGD, we propose a memory efficient
equal batch-size strategy and show that under mild conditions, the estimator is
consistent. A key feature of the proposed batching technique is that it allows
for bias-correction of the variance, at no cost to memory. Since joint
inference for high dimensional problems may be undesirable, we present
marginal-friendly simultaneous confidence intervals, and show through an
example how covariance estimators of ASGD can be employed in improved
predictions.
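A minimal offline sketch of an equal batch-size batch-means covariance estimate from stored SGD iterates; the paper's version runs online in bounded memory and adds a bias correction, so the names and details here are illustrative.

```python
import numpy as np

def batch_means_cov(iterates, b):
    """Equal batch-size batch-means estimate of the ASGD covariance.

    iterates: (n, d) array of SGD iterates; b: common batch size.
    """
    n, d = iterates.shape
    M = n // b                                    # number of full batches
    means = iterates[: M * b].reshape(M, b, d).mean(axis=1)
    grand = iterates[: M * b].mean(axis=0)
    centered = means - grand
    return b * (centered.T @ centered) / (M - 1)

# Hypothetical usage: Sigma = batch_means_cov(iters, b=int(len(iters) ** 0.5))
```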
Fast and Robust Online Inference with Stochastic Gradient Descent via Random Scaling
We develop a new method of online inference for a vector of parameters
estimated by the Polyak-Ruppert averaging procedure of stochastic gradient
descent (SGD) algorithms. We leverage insights from time series regression in
econometrics and construct asymptotically pivotal statistics via random
scaling. Our approach is fully operational with online data and is rigorously
underpinned by a functional central limit theorem. Our proposed inference
method has two key advantages over existing methods. First, the
test statistic is computed in an online fashion with only SGD iterates and the
critical values can be obtained without any resampling methods, thereby
allowing for efficient implementation suitable for massive online data. Second,
there is no need to estimate the asymptotic variance and our inference method
is shown to be robust to changes in the tuning parameters for SGD algorithms in
simulation experiments with synthetic data.
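A scalar sketch of the random-scaling construction: the variance proxy V_n = n^{-2} * sum_t t^2 (bar_theta_t - bar_theta_n)^2 is accumulated online from three running sums, and the interval uses the tabulated 97.5% critical value 6.747 of the pivotal statistic (Abadir and Paruolo, 1997) instead of normal quantiles. The loss and step sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)

# Averaged SGD for a scalar mean, keeping the running sums needed
# for the random-scaling variance -- all updated online, O(1) memory.
mu_star, n = 1.0, 100_000
theta, bar = 0.0, 0.0
s1 = s2 = s3 = 0.0
for t in range(1, n + 1):
    x = mu_star + rng.normal()
    theta -= 0.5 * t ** -0.6 * (theta - x)   # SGD step, squared loss
    bar += (theta - bar) / t                 # running average bar_t
    s1 += t * t * bar * bar                  # sums for V_n, updated online
    s2 += t * t * bar
    s3 += t * t

# Random-scaling variance: V_n = n^{-2} * sum_t t^2 (bar_t - bar_n)^2.
V = (s1 - 2 * bar * s2 + bar * bar * s3) / n**2
half = 6.747 * np.sqrt(V / n)                # 95% two-sided interval
print(f"[{bar - half:.4f}, {bar + half:.4f}] around estimate {bar:.4f}")
```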