A Stochastic Conjugate Gradient Method for Approximation of Functions
A stochastic conjugate gradient method for approximation of a function is
proposed. The proposed method avoids computing and storing the covariance
matrix in the normal equations for the least squares solution. In addition, the
method performs the conjugate gradient steps by using an inner product that is
based on stochastic sampling. Theoretical analysis shows that the method is
convergent in probability. The method has applications in such fields as
predistortion for the linearization of power amplifiers. Comment: 21 pages, 5 figures
Matrix-free GPU implementation of a preconditioned conjugate gradient solver for anisotropic elliptic PDEs
Many problems in geophysical and atmospheric modelling require the fast
solution of elliptic partial differential equations (PDEs) in "flat" three
dimensional geometries. In particular, an anisotropic elliptic PDE for the
pressure correction has to be solved at every time step in the dynamical core
of many numerical weather prediction models, and equations of a very similar
structure arise in global ocean models, subsurface flow simulations and gas and
oil reservoir modelling. The elliptic solve is often the bottleneck of the
forecast, and an algorithmically optimal method has to be used and implemented
efficiently. Graphics Processing Units have been shown to be highly efficient
for a wide range of applications in scientific computing, and recently
iterative solvers have been parallelised on these architectures. We describe
the GPU implementation and optimisation of a Preconditioned Conjugate Gradient
(PCG) algorithm for the solution of a three dimensional anisotropic elliptic
PDE for the pressure correction in NWP. Our implementation exploits the strong
vertical anisotropy of the elliptic operator in the construction of a suitable
preconditioner. As the algorithm is memory bound, performance can be improved
significantly by reducing the amount of global memory access. We achieve this
by using a matrix-free implementation which does not require explicit storage
of the matrix and instead recalculates the local stencil. Global memory access
can also be reduced by rewriting the algorithm using loop fusion and we show
that this further reduces the runtime on the GPU. We demonstrate the
performance of our matrix-free GPU code by comparing it to a sequential CPU
implementation and to a matrix-explicit GPU code which uses existing libraries.
The absolute performance of the algorithm for different problem sizes is
quantified in terms of floating point throughput and global memory bandwidth. Comment: 18 pages, 7 figures
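The matrix-free idea in the abstract above can be sketched on a much smaller model problem. The following Python sketch is not the authors' GPU code: it assumes a two-dimensional 5-point stencil with hypothetical coupling strengths ax and az (az much larger than ax), zero Dirichlet boundaries, and a preconditioner that solves the vertical tridiagonal part of the operator exactly, column by column. All function names are illustrative.

```python
import math

def apply_operator(u, ax, az):
    """Apply the 5-point stencil on the fly; no matrix is ever stored."""
    nx, nz = len(u), len(u[0])
    out = [[0.0] * nz for _ in range(nx)]
    for i in range(nx):
        for k in range(nz):
            val = (2.0 * ax + 2.0 * az) * u[i][k]
            if i > 0:
                val -= ax * u[i - 1][k]
            if i < nx - 1:
                val -= ax * u[i + 1][k]
            if k > 0:
                val -= az * u[i][k - 1]
            if k < nz - 1:
                val -= az * u[i][k + 1]
            out[i][k] = val
    return out

def solve_vertical_lines(r, ax, az):
    """Preconditioner: solve the vertical tridiagonal part per column
    with the Thomas algorithm (assumed choice for this sketch)."""
    nx, nz = len(r), len(r[0])
    diag = 2.0 * ax + 2.0 * az
    z = []
    for i in range(nx):
        c = [0.0] * nz  # modified upper diagonal
        d = [0.0] * nz  # modified right-hand side
        c[0] = -az / diag
        d[0] = r[i][0] / diag
        for k in range(1, nz):
            denom = diag + az * c[k - 1]
            c[k] = -az / denom
            d[k] = (r[i][k] + az * d[k - 1]) / denom
        col = [0.0] * nz
        col[nz - 1] = d[nz - 1]
        for k in range(nz - 2, -1, -1):
            col[k] = d[k] - c[k] * col[k + 1]
        z.append(col)
    return z

def dot(a, b):
    """Inner product of two grids stored as nested lists."""
    return sum(x * y for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def pcg(b, ax, az, tol=1e-10, max_iter=200):
    """Preconditioned conjugate gradients; returns (solution, iterations)."""
    nx, nz = len(b), len(b[0])
    u = [[0.0] * nz for _ in range(nx)]
    r = [row[:] for row in b]
    b_norm = math.sqrt(dot(b, b))
    if b_norm == 0.0:
        return u, 0
    z = solve_vertical_lines(r, ax, az)
    p = [row[:] for row in z]
    rz = dot(r, z)
    for it in range(1, max_iter + 1):
        Ap = apply_operator(p, ax, az)
        alpha = rz / dot(p, Ap)
        for i in range(nx):
            for k in range(nz):
                u[i][k] += alpha * p[i][k]
                r[i][k] -= alpha * Ap[i][k]
        if math.sqrt(dot(r, r)) <= tol * b_norm:
            return u, it
        z = solve_vertical_lines(r, ax, az)
        rz_new = dot(r, z)
        beta = rz_new / rz
        for i in range(nx):
            for k in range(nz):
                p[i][k] = z[i][k] + beta * p[i][k]
        rz = rz_new
    return u, max_iter
```

Calling pcg(b, 1.0, 100.0) on a right-hand side grid b returns the solution grid and the number of iterations used. Because the preconditioner captures the dominant vertical coupling, the iteration count in this toy setting should stay small compared with the number of unknowns.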
On limited-memory quasi-Newton methods for minimizing a quadratic function
The main focus in this paper is exact linesearch methods for minimizing a
quadratic function whose Hessian is positive definite. We give two classes of
limited-memory quasi-Newton Hessian approximations that generate search
directions parallel to those of the method of preconditioned conjugate
gradients, and hence give finite termination on quadratic optimization
problems. The Hessian approximations are described by a novel compact
representation which provides a dynamical framework. We also discuss possible
extensions of these classes and show their behavior on randomly generated
quadratic optimization problems. The methods behave numerically similarly to
L-BFGS. Inclusion of information from the first iteration in the limited-memory
Hessian approximation and L-BFGS significantly reduces the effects of round-off
errors on the considered problems. In addition, we give our compact
representation of the Hessian approximations in the full Broyden class for the
general unconstrained optimization problem. This representation consists of
explicit matrices and gradients only as vector components.
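The finite-termination property mentioned above can be checked numerically on a tiny example. The sketch below uses the full-memory BFGS inverse update with the identity as the initial approximation, which is a standard textbook method and not one of the paper's limited-memory classes. The function name and the 3-by-3 test matrix are illustrative assumptions.

```python
def matvec(M, v):
    """Dense matrix-vector product."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def bfgs_exact_linesearch(H, b, x0):
    """Minimize 0.5*x'Hx - b'x with BFGS and an exact line search.

    Runs at most n iterations and returns (x, gradient, search directions).
    """
    n = len(b)
    B = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    x = list(x0)
    g = [hx - bi for hx, bi in zip(matvec(H, x), b)]
    directions = []
    for _ in range(n):
        if dot(g, g) < 1e-24:
            break
        d = [-v for v in matvec(B, g)]       # quasi-Newton direction
        Hd = matvec(H, d)
        alpha = -dot(g, d) / dot(d, Hd)      # exact minimizer along d
        s = [alpha * di for di in d]         # step
        y = [alpha * hdi for hdi in Hd]      # gradient change, y = H s
        x = [xi + si for xi, si in zip(x, s)]
        g = [gi + yi for gi, yi in zip(g, y)]
        directions.append(d)
        # BFGS inverse update: (I - rho s y') B (I - rho y s') + rho s s'
        rho = 1.0 / dot(y, s)
        By = matvec(B, y)
        coef = rho * rho * dot(y, By) + rho
        B = [[B[i][j] - rho * (s[i] * By[j] + By[i] * s[j]) + coef * s[i] * s[j]
              for j in range(n)] for i in range(n)]
    return x, g, directions

H = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]  # positive definite
b = [1.0, 2.0, 3.0]
x, g, directions = bfgs_exact_linesearch(H, b, [0.0, 0.0, 0.0])
```

After at most three iterations the gradient g should vanish up to round-off, and the stored directions should be mutually conjugate with respect to H, which is the property that conjugate gradient directions also have.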
Online optimization of storage ring nonlinear beam dynamics
We propose to optimize the nonlinear beam dynamics of existing and future
storage rings with direct online optimization techniques. This approach may
have crucial importance for the implementation of diffraction limited storage
rings. In this paper considerations and algorithms for the online optimization
approach are discussed. We have applied this approach to experimentally improve
the dynamic aperture of the SPEAR3 storage ring with the robust conjugate
direction search method and the particle swarm optimization method. The dynamic
aperture was improved by more than 5 mm within a short period of time.
Experimental setup and results are presented.
Examination of accelerated first order methods for aircraft flight path optimization
Accelerated first order methods for aircraft flight path optimization
Solving Large-Scale Optimization Problems Related to Bell's Theorem
The impossibility of finding local realistic models for quantum correlations
due to entanglement is an important fact in the foundations of quantum physics
that is now gaining new applications in quantum information theory. We present an in-depth
description of a method of testing the existence of such models, which involves
two levels of optimization: a higher-level non-linear task and a lower-level
linear programming (LP) task. The article compares the performances of the
existing implementation of the method, where the LPs are solved with the
simplex method, and our new implementation, where the LPs are solved with a
matrix-free interior point method. We describe in detail how the latter can be
applied to our problem, discuss the basic scenario and possible improvements
and how they affect overall performance. A significant performance advantage
of the matrix-free interior point method over the simplex method is confirmed
by extensive computational results. The new method is able to solve problems
which are orders of magnitude larger. Consequently, the noise resistance of the
non-classicality of correlations of several types of quantum states, which has
never been computed before, can now be efficiently determined. An extensive set
of data in the form of tables and graphics is presented and discussed. The
article is intended for all audiences; no quantum-mechanical background is
necessary. Comment: 19 pages, 7 tables, 1 figure