40,540 research outputs found

    A Stochastic Conjugate Gradient Method for Approximation of Functions

    A stochastic conjugate gradient method for approximation of a function is proposed. The proposed method avoids computing and storing the covariance matrix in the normal equations for the least squares solution. In addition, the method performs the conjugate gradient steps by using an inner product that is based on stochastic sampling. Theoretical analysis shows that the method is convergent in probability. The method has applications in such fields as predistortion for the linearization of power amplifiers. Comment: 21 pages, 5 figures.
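    The idea of running conjugate gradient steps with a sampled inner product can be illustrated with a minimal sketch. It is not the algorithm from the paper: it assumes a fixed batch of sample points, a small monomial basis, and plain CG on the normal equations, and the names `sampled_cg_fit`, `basis`, and `samples` are hypothetical. The point it shows is that the Gram (covariance) matrix is never formed; only its action on a coefficient vector is evaluated through sample averages.

```python
import numpy as np

def sampled_cg_fit(f, basis, samples, n_iter=20, tol=1e-12):
    """Least-squares fit of f in span(basis) using CG with a sample-average inner product.

    Assumption: a fixed sample batch is reused in every iteration.
    """
    # Basis functions evaluated at the samples (N x m); the m x m Gram matrix is never built.
    Phi = np.column_stack([phi(samples) for phi in basis])
    y = f(samples)
    n = len(samples)

    def gram_apply(c):
        # Action of the sampled Gram matrix on c: <phi_i, sum_j c_j phi_j> as a sample mean.
        return Phi.T @ (Phi @ c) / n

    rhs = Phi.T @ y / n            # sampled inner products <phi_i, f>
    c = np.zeros(len(basis))
    r = rhs - gram_apply(c)        # residual of the normal equations
    p = r.copy()
    rr = r @ r
    for _ in range(n_iter):
        if np.sqrt(rr) < tol:
            break
        Gp = gram_apply(p)
        step = rr / (p @ Gp)
        c += step * p
        r -= step * Gp
        rr_new = r @ r
        p = r + (rr_new / rr) * p
        rr = rr_new
    return c

# Hypothetical usage: recover the coefficients of a quadratic from random samples.
rng = np.random.default_rng(0)
samples = rng.uniform(-1.0, 1.0, 2000)
basis = [lambda x: np.ones_like(x), lambda x: x, lambda x: x**2]
coeffs = sampled_cg_fit(lambda x: 1.0 + 2.0 * x + 3.0 * x**2, basis, samples)
```

    Because the target here lies exactly in the span of the basis, the fitted coefficients should match those of the target polynomial up to rounding.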

    Matrix-free GPU implementation of a preconditioned conjugate gradient solver for anisotropic elliptic PDEs

    Many problems in geophysical and atmospheric modelling require the fast solution of elliptic partial differential equations (PDEs) in "flat" three-dimensional geometries. In particular, an anisotropic elliptic PDE for the pressure correction has to be solved at every time step in the dynamical core of many numerical weather prediction (NWP) models, and equations of a very similar structure arise in global ocean models, subsurface flow simulations and gas and oil reservoir modelling. The elliptic solve is often the bottleneck of the forecast, and an algorithmically optimal method has to be used and implemented efficiently. Graphics Processing Units (GPUs) have been shown to be highly efficient for a wide range of applications in scientific computing, and recently iterative solvers have been parallelised on these architectures. We describe the GPU implementation and optimisation of a Preconditioned Conjugate Gradient (PCG) algorithm for the solution of a three-dimensional anisotropic elliptic PDE for the pressure correction in NWP. Our implementation exploits the strong vertical anisotropy of the elliptic operator in the construction of a suitable preconditioner. As the algorithm is memory bound, performance can be improved significantly by reducing the amount of global memory access. We achieve this by using a matrix-free implementation which does not require explicit storage of the matrix and instead recalculates the local stencil. Global memory access can also be reduced by rewriting the algorithm using loop fusion, and we show that this further reduces the runtime on the GPU. We demonstrate the performance of our matrix-free GPU code by comparing it to a sequential CPU implementation and to a matrix-explicit GPU code which uses existing libraries. The absolute performance of the algorithm for different problem sizes is quantified in terms of floating point throughput and global memory bandwidth. Comment: 18 pages, 7 figures.
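    The matrix-free idea described above can be sketched on the CPU with NumPy. This is a deliberately simplified stand-in, not the paper's GPU code: it assumes a one-dimensional three-point stencil with a hypothetical shift parameter `alpha` and a simple diagonal (Jacobi) preconditioner in place of the vertical-anisotropy preconditioner. The operator is applied by recomputing the stencil on the fly, so no matrix is stored.

```python
import numpy as np

def apply_operator(u, alpha=1.0):
    """Apply the stencil (2 + alpha) u_i - u_{i-1} - u_{i+1} with zero boundary values.

    The stencil is recomputed on every call; no matrix is stored.
    """
    out = (2.0 + alpha) * u
    out[1:] -= u[:-1]
    out[:-1] -= u[1:]
    return out

def pcg(apply_A, b, diag, tol=1e-10, max_iter=1000):
    """Preconditioned CG with a diagonal preconditioner; returns (solution, iterations)."""
    x = np.zeros_like(b)
    r = b - apply_A(x)
    z = r / diag                   # apply the preconditioner
    p = z.copy()
    rz = r @ z
    for k in range(max_iter):
        Ap = apply_A(p)
        step = rz / (p @ Ap)
        x += step * p
        r -= step * Ap
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            return x, k + 1
        z = r / diag
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, max_iter

# Hypothetical usage on a small problem.
b = np.ones(50)
diag = np.full(50, 3.0)            # diagonal of the operator for alpha = 1
x, iterations = pcg(apply_operator, b, diag)
```

    Passing the operator as a function rather than as an array is what makes the solver matrix-free; on a GPU the same structure lets the stencil be recalculated in registers instead of being read from global memory.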

    On limited-memory quasi-Newton methods for minimizing a quadratic function

    The main focus in this paper is exact linesearch methods for minimizing a quadratic function whose Hessian is positive definite. We give two classes of limited-memory quasi-Newton Hessian approximations that generate search directions parallel to those of the method of preconditioned conjugate gradients, and hence give finite termination on quadratic optimization problems. The Hessian approximations are described by a novel compact representation which provides a dynamical framework. We also discuss possible extensions of these classes and show their behavior on randomly generated quadratic optimization problems. The methods behave numerically similarly to L-BFGS. Inclusion of information from the first iteration in the limited-memory Hessian approximation and L-BFGS significantly reduces the effects of round-off errors on the considered problems. In addition, we give our compact representation of the Hessian approximations in the full Broyden class for the general unconstrained optimization problem. This representation consists of explicit matrices and gradients only as vector components.
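    The finite-termination property mentioned above can be checked numerically with a short sketch of the unpreconditioned conjugate gradient method with exact linesearch. It does not implement the paper's quasi-Newton approximations; the function name `cg_quadratic` and the test problem are assumptions made for illustration. On a quadratic with an n-by-n positive definite Hessian, the gradient should vanish (up to rounding) after at most n steps.

```python
import numpy as np

def cg_quadratic(H, c, x0):
    """Minimize 0.5 x^T H x + c^T x by exact linesearch along conjugate directions.

    Returns the final iterate and the gradient norm at every iterate.
    """
    x = x0.astype(float).copy()
    g = H @ x + c                  # gradient of the quadratic
    p = -g
    history = [np.linalg.norm(g)]
    for _ in range(len(c)):        # at most n steps in exact arithmetic
        if history[-1] < 1e-12:
            break
        Hp = H @ p
        alpha = -(g @ p) / (p @ Hp)        # exact minimizer along p
        x = x + alpha * p
        g_new = H @ x + c
        beta = (g_new @ g_new) / (g @ g)   # Fletcher-Reeves formula
        p = -g_new + beta * p
        g = g_new
        history.append(np.linalg.norm(g))
    return x, history

# Hypothetical test problem: a 5 x 5 Hessian with eigenvalues 1, ..., 5.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))
H = Q @ np.diag([1.0, 2.0, 3.0, 4.0, 5.0]) @ Q.T
c = rng.standard_normal(5)
x_min, grad_norms = cg_quadratic(H, c, np.zeros(5))
```

    In floating-point arithmetic the termination is only approximate, which is why the abstract's remark about round-off effects matters for larger or ill-conditioned problems.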

    Online optimization of storage ring nonlinear beam dynamics

    We propose to optimize the nonlinear beam dynamics of existing and future storage rings with direct online optimization techniques. This approach may be crucially important for the implementation of diffraction-limited storage rings. In this paper, considerations and algorithms for the online optimization approach are discussed. We have applied this approach to experimentally improve the dynamic aperture of the SPEAR3 storage ring with the robust conjugate direction search method and the particle swarm optimization method. The dynamic aperture was improved by more than 5 mm within a short period of time. Experimental setup and results are presented.

    Examination of accelerated first order methods for aircraft flight path optimization

    Accelerated first order methods for aircraft flight path optimization

    Solving Large-Scale Optimization Problems Related to Bell's Theorem

    The impossibility of finding local realistic models for quantum correlations due to entanglement is an important fact in the foundations of quantum physics, and it is now gaining new applications in quantum information theory. We present an in-depth description of a method of testing the existence of such models, which involves two levels of optimization: a higher-level non-linear task and a lower-level linear programming (LP) task. The article compares the performance of the existing implementation of the method, where the LPs are solved with the simplex method, and our new implementation, where the LPs are solved with a matrix-free interior point method. We describe in detail how the latter can be applied to our problem, and discuss the basic scenario, possible improvements, and how they affect overall performance. A significant performance advantage of the matrix-free interior point method over the simplex method is confirmed by extensive computational results. The new method is able to solve problems which are orders of magnitude larger. Consequently, the noise resistance of the non-classicality of correlations of several types of quantum states, which has never been computed before, can now be efficiently determined. An extensive set of data in the form of tables and graphics is presented and discussed. The article is intended for all audiences; no quantum-mechanical background is necessary. Comment: 19 pages, 7 tables, 1 figure.