Rayleigh-Gauss-Newton optimization with enhanced sampling for variational Monte Carlo
Variational Monte Carlo (VMC) is an approach for computing ground-state
wavefunctions that has recently become more powerful due to the introduction of
neural network-based wavefunction parametrizations. However, efficiently
training neural wavefunctions to converge to an energy minimum remains a
difficult problem. In this work, we analyze optimization and sampling methods
used in VMC and introduce alterations to improve their performance. First,
based on theoretical convergence analysis in a noiseless setting, we motivate a
new optimizer, which we call the Rayleigh-Gauss-Newton (RGN) method, that can
improve upon gradient descent and natural gradient descent to achieve superlinear
convergence with little added computational cost. Second, in order to realize
this favorable comparison in the presence of stochastic noise, we analyze the
effect of sampling error on VMC parameter updates and experimentally
demonstrate that it can be reduced by the parallel tempering method. In
particular, we demonstrate that RGN can be made robust to energy spikes that
occur when new regions of configuration space become available to the sampler
over the course of optimization. Finally, putting theory into practice, we
apply our enhanced optimization and sampling methods to the transverse-field
Ising and XXZ models on large lattices, yielding ground-state energy estimates
with remarkably high accuracy after just 200-500 parameter updates. Comment: 12 pages, 7 figures
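To make the sampling enhancement concrete, the following is a minimal parallel-tempering sketch in Python (an illustration, not the authors' implementation): several Metropolis chains sample tempered versions of |psi(s)|^2 and periodically exchange configurations, which helps the sampler reach regions of configuration space that become available during optimization. Here log_prob is a placeholder for 2 log|psi(s)| of the current wavefunction, and the temperatures, sweep count, and single-spin-flip proposal are arbitrary choices for the sketch.

```python
# Minimal parallel-tempering sketch (not the paper's code). Each chain runs
# single-spin-flip Metropolis at its own inverse temperature beta, targeting
# exp(beta * log_prob(s)); neighbouring chains periodically attempt swaps.
import numpy as np

def parallel_tempering(log_prob, n_sites, betas=(1.0, 0.7, 0.4),
                       n_sweeps=200, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    configs = [rng.choice([-1, 1], size=n_sites) for _ in betas]
    for _ in range(n_sweeps):
        # Metropolis sweep within each tempered chain.
        for c, beta in zip(configs, betas):
            for i in rng.permutation(n_sites):
                proposal = c.copy()
                proposal[i] *= -1
                if np.log(rng.random()) < beta * (log_prob(proposal) - log_prob(c)):
                    c[:] = proposal
        # Attempt to swap configurations between neighbouring temperatures.
        for k in range(len(betas) - 1):
            lp_a, lp_b = log_prob(configs[k]), log_prob(configs[k + 1])
            if np.log(rng.random()) < (betas[k] - betas[k + 1]) * (lp_b - lp_a):
                configs[k], configs[k + 1] = configs[k + 1], configs[k]
    return configs[0]  # final state of the beta = 1 (physical) chain
```

In practice one would accumulate samples from the beta = 1 chain across sweeps for the VMC estimators; the sketch only returns its final configuration.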
Using Automatic Differentiation as a General Framework for Ptychographic Reconstruction
Coherent diffraction imaging methods enable imaging beyond lens-imposed
resolution limits. In these methods, the object can be recovered by minimizing
an error metric that quantifies the difference between diffraction patterns as
observed and those calculated from the current guess of the object. Efficient
minimization methods require analytical calculation of the derivatives of the
error metric, which is not always straightforward. This limits our ability to
explore variations of basic imaging approaches. In this paper, we propose to
substitute analytical derivative expressions with the automatic differentiation
method, whereby we can achieve object reconstruction by specifying only the
physics-based experimental forward model. We demonstrate the generality of the
proposed method through straightforward object reconstruction for a variety of
complex ptychographic experimental models. Comment: 23 pages (including references and supplemental material), 19 externally generated figure files
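As an illustration of this approach (a sketch under simplifying assumptions, not the paper's code), the snippet below recovers a 2D object by gradient descent on an error metric defined only through a far-field forward model, with all derivatives supplied by JAX's automatic differentiation; the forward model, the intensity mean-squared-error metric, the flat initial guess, and the constants are placeholder choices.

```python
# AD-based ptychographic reconstruction sketch: no hand-derived gradients.
import jax
import jax.numpy as jnp

def forward(obj_re, obj_im, probe, positions, patch):
    # Far-field intensity predicted at each scan position (simplified model).
    obj = obj_re + 1j * obj_im
    frames = []
    for (r, c) in positions:
        exit_wave = probe * jax.lax.dynamic_slice(obj, (r, c), (patch, patch))
        frames.append(jnp.abs(jnp.fft.fft2(exit_wave)) ** 2)
    return jnp.stack(frames)

def loss(obj_re, obj_im, probe, positions, patch, measured):
    # Intensity mean-squared error; other metrics only require changing this line.
    return jnp.sum((forward(obj_re, obj_im, probe, positions, patch) - measured) ** 2)

grad_fn = jax.grad(loss, argnums=(0, 1))   # derivatives via automatic differentiation

def reconstruct(measured, probe, positions, patch, shape, steps=200, lr=1e-3):
    obj_re, obj_im = jnp.ones(shape), jnp.zeros(shape)   # flat initial guess
    for _ in range(steps):
        g_re, g_im = grad_fn(obj_re, obj_im, probe, positions, patch, measured)
        obj_re, obj_im = obj_re - lr * g_re, obj_im - lr * g_im
    return obj_re + 1j * obj_im
```

Changing the experimental model or the error metric only requires editing forward or loss; no derivative expressions need to be rederived, which is the generality argued for above.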
Hadamard Wirtinger Flow for Sparse Phase Retrieval
We consider the problem of reconstructing an $n$-dimensional $k$-sparse
signal from a set of noiseless magnitude-only measurements. Formulating the
problem as an unregularized empirical risk minimization task, we study the
sample complexity performance of gradient descent with Hadamard
parametrization, which we call Hadamard Wirtinger flow (HWF). Provided
knowledge of the signal sparsity $k$, we prove that a single step of HWF is
able to recover the support from $k\,(x^*_{\max})^{-2}$ samples (modulo a logarithmic term),
where $x^*_{\max}$ is the largest component of the signal in magnitude.
This support recovery procedure can be used to initialize existing
reconstruction methods and yields algorithms with total runtime proportional to
the cost of reading the data and improved sample complexity, which is linear in
$k$ when the signal contains at least one large component. We numerically
investigate the performance of HWF at convergence and show that, while not
requiring any explicit form of regularization or knowledge of $k$, HWF adapts
to the signal sparsity and reconstructs sparse signals with fewer measurements
than existing gradient-based methods.
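To show the mechanics only (this is not the paper's exact algorithm), the sketch below runs plain gradient descent on an unregularized empirical risk under one common Hadamard parametrization, x = u*u - v*v elementwise; the parametrization, loss scaling, step size, and in particular the symmetry-breaking initialization are assumptions made for the sketch and differ from the paper's initialization, which is what underlies its one-step support-recovery guarantee.

```python
# Gradient descent on a real-valued intensity loss under a Hadamard
# parametrization of the signal (illustrative sketch).
import jax
import jax.numpy as jnp

def risk(params, A, y):
    u, v = params
    x = u * u - v * v                      # Hadamard product parametrization
    return jnp.mean(((A @ x) ** 2 - y) ** 2) / 2.0

@jax.jit
def gd_step(params, A, y, lr):
    g_u, g_v = jax.grad(risk)(params, A, y)
    u, v = params
    return u - lr * g_u, v - lr * g_v

def hwf_sketch(A, y, n_steps=2000, lr=0.05, alpha=1e-3):
    n = A.shape[1]
    # Small, dense start that breaks the u = v symmetry so x_0 is nonzero.
    params = (jnp.full(n, 2.0 * alpha), jnp.full(n, alpha))
    for _ in range(n_steps):
        params = gd_step(params, A, y, lr)
    u, v = params
    return u * u - v * v
```

Note that no explicit sparsity penalty appears anywhere in the objective or the update; any adaptation to sparsity has to come from the reparametrized gradient dynamics themselves.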
Implicit Regularization in Nonconvex Statistical Estimation: Gradient Descent Converges Linearly for Phase Retrieval, Matrix Completion, and Blind Deconvolution
Recent years have seen a flurry of activities in designing provably efficient
nonconvex procedures for solving statistical estimation problems. Due to the
highly nonconvex nature of the empirical loss, state-of-the-art procedures
often require proper regularization (e.g. trimming, regularized cost,
projection) in order to guarantee fast convergence. For vanilla procedures such
as gradient descent, however, prior theory either recommends highly
conservative learning rates to avoid overshooting, or completely lacks
performance guarantees.
This paper uncovers a striking phenomenon in nonconvex optimization: even in
the absence of explicit regularization, gradient descent enforces proper
regularization implicitly under various statistical models. In fact, gradient
descent follows a trajectory staying within a basin that enjoys nice geometry,
consisting of points incoherent with the sampling mechanism. This "implicit
regularization" feature allows gradient descent to proceed in a far more
aggressive fashion without overshooting, which in turn results in substantial
computational savings. Focusing on three fundamental statistical estimation
problems, i.e. phase retrieval, low-rank matrix completion, and blind
deconvolution, we establish that gradient descent achieves near-optimal
statistical and computational guarantees without explicit regularization. In
particular, by marrying statistical modeling with generic optimization theory,
we develop a general recipe for analyzing the trajectories of iterative
algorithms via a leave-one-out perturbation argument. As a byproduct, for noisy
matrix completion, we demonstrate that gradient descent achieves near-optimal
error control, measured entrywise and by the spectral norm, which might
be of independent interest. Comment: accepted to Foundations of Computational Mathematics (FOCM).
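For the phase-retrieval case, the "vanilla" procedure in question can be sketched as follows (a real-valued toy version with illustrative constants): spectral initialization followed by constant-step-size gradient descent on the unregularized loss, with no trimming, truncation, projection, or other explicit regularization.

```python
# Vanilla gradient descent for real-valued phase retrieval (illustrative sketch).
import jax
import jax.numpy as jnp

def wf_loss(x, A, y):
    # y_i = (a_i^T x_true)^2 ; quadratic loss on intensities, averaged over samples.
    return jnp.mean(((A @ x) ** 2 - y) ** 2) / 4.0

def spectral_init(A, y):
    # Leading eigenvector of (1/m) sum_i y_i a_i a_i^T, rescaled to the signal's norm.
    m = A.shape[0]
    Y = (A * y[:, None]).T @ A / m
    _, eigvecs = jnp.linalg.eigh(Y)
    return jnp.sqrt(jnp.mean(y)) * eigvecs[:, -1]

def vanilla_gd(A, y, n_steps=500, lr=0.1):
    x = spectral_init(A, y)
    grad_fn = jax.jit(jax.grad(wf_loss))
    for _ in range(n_steps):
        x = x - lr * grad_fn(x, A, y)
    return x
```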