298 research outputs found
Three real-space discretization techniques in electronic structure calculations
A characteristic feature of the state of the art of real-space methods in
electronic structure calculations is the diversity of the techniques used in
the discretization of the relevant partial differential equations. In this
context, the main approaches include finite-difference methods, various types
of finite elements, and wavelets. This paper reports on the results of several
code development projects that approach problems related to the electronic
structure using these three different discretization methods. We review the
ideas behind these methods, give examples of their applications, and discuss
their similarities and differences.
Comment: 39 pages, 10 figures, accepted to a special issue of "physica status
solidi (b) - basic solid state physics" devoted to the CECAM workshop "State
of the art developments and perspectives of real-space electronic structure
techniques in condensed matter and molecular physics". v2: Minor stylistic
and typographical changes, partly inspired by referee comments
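As a rough illustration of the finite-difference approach named in the abstract above, the sketch below applies the standard three-point central stencil to sin(pi x) on [0, 1] and compares the result with the exact second derivative. It is a generic textbook example, not code from the paper; the function names `second_derivative_fd` and `max_error` and the chosen grid sizes are assumptions made for illustration.

```python
import math


def second_derivative_fd(f_values, h):
    """Three-point central stencil (f[i-1] - 2 f[i] + f[i+1]) / h^2 at interior points."""
    return [
        (f_values[i - 1] - 2.0 * f_values[i] + f_values[i + 1]) / (h * h)
        for i in range(1, len(f_values) - 1)
    ]


def max_error(n):
    """Largest stencil error for f(x) = sin(pi x) on [0, 1] using n grid intervals."""
    h = 1.0 / n
    xs = [i * h for i in range(n + 1)]
    f = [math.sin(math.pi * x) for x in xs]
    approx = second_derivative_fd(f, h)
    # Exact second derivative of sin(pi x) is -pi^2 sin(pi x).
    exact = [-(math.pi ** 2) * math.sin(math.pi * x) for x in xs[1:-1]]
    return max(abs(a - e) for a, e in zip(approx, exact))


# Hypothetical grid sizes: halving h should cut the error by roughly a factor of 4,
# since the stencil is second-order accurate.
err_coarse = max_error(20)
err_fine = max_error(40)
print(err_coarse, err_fine, err_coarse / err_fine)
```

The same stencil, applied to the kinetic-energy operator on a uniform grid, is the simplest of the three discretizations the review compares; finite elements and wavelets replace the grid values with expansion coefficients in a basis.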
Numerically Stable Recurrence Relations for the Communication Hiding Pipelined Conjugate Gradient Method
Pipelined Krylov subspace methods (also referred to as communication-hiding
methods) have been proposed in the literature as a scalable alternative to
classic Krylov subspace algorithms for iteratively computing the solution to a
large linear system in parallel. For symmetric and positive definite system
matrices the pipelined Conjugate Gradient method outperforms its classic
Conjugate Gradient counterpart on large scale distributed memory hardware by
overlapping global communication with essential computations like the
matrix-vector product, thus hiding global communication. A well-known drawback
of the pipelining technique is the (possibly significant) loss of numerical
stability. In this work a numerically stable variant of the pipelined Conjugate
Gradient algorithm is presented that avoids the propagation of local rounding
errors in the finite precision recurrence relations that construct the Krylov
subspace basis. The multi-term recurrence relation for the basis vector is
replaced by two-term recurrences, improving stability without increasing the
overall computational cost of the algorithm. The proposed modification ensures
that the pipelined Conjugate Gradient method is able to attain a highly
accurate solution independently of the pipeline length. Numerical experiments
demonstrate a combination of excellent parallel performance and improved
maximal attainable accuracy for the new pipelined Conjugate Gradient algorithm.
This work thus resolves one of the major practical restrictions for the
usability of pipelined Krylov subspace methods.
Comment: 15 pages, 5 figures, 1 table, 2 algorithms
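For orientation, here is a minimal sketch of the classic (non-pipelined) Conjugate Gradient iteration with its coupled two-term recurrences. It is the textbook algorithm, not the stabilized pipelined variant proposed in the paper; the helper names and the small test system are assumptions. The comments mark the two dot products per iteration, which on distributed hardware become the global reductions that pipelined variants overlap with the matrix-vector product.

```python
def matvec(A, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(A, b, tol=1e-12, max_iter=1000):
    """Classic CG for a symmetric positive definite A; returns (x, iterations)."""
    x = [0.0] * len(b)
    r = list(b)          # residual r = b - A x, with x = 0 initially
    p = list(r)          # search direction
    rr = dot(r, r)
    stop = tol * tol * dot(b, b)
    for k in range(max_iter):
        if rr <= stop:
            return x, k
        Ap = matvec(A, p)
        alpha = rr / dot(p, Ap)          # first global reduction
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rr_new = dot(r, r)               # second global reduction
        beta = rr_new / rr
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x, max_iter


# Hypothetical 2x2 SPD test system; the exact solution is (1/11, 7/11).
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
solution, iterations = conjugate_gradient(A, b)
print(solution, iterations)
```

In this classic form each reduction must finish before the next step can start. Pipelined variants reorder the recurrences so the reductions run concurrently with the matrix-vector product, and that reordering is where the rounding-error propagation discussed in the abstract comes from.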
Custom optimization algorithms for efficient hardware implementation
The focus is on real-time optimal decision making with application in advanced control
systems. These computationally intensive schemes, which involve the repeated solution of
(convex) optimization problems within a sampling interval, require more efficient computational
methods than currently available for extending their application to highly dynamical
systems and setups with resource-constrained embedded computing platforms.
A range of techniques are proposed to exploit synergies between digital hardware, numerical
analysis and algorithm design. These techniques build on top of parameterisable
hardware code generation tools that generate VHDL code describing custom computing
architectures for interior-point methods and a range of first-order constrained optimization
methods. Since memory limitations are often important in embedded implementations, we
develop a custom storage scheme for KKT matrices arising in interior-point methods for
control, which reduces memory requirements significantly and prevents I/O bandwidth
limitations from affecting the performance in our implementations. To take advantage of
the trend towards parallel computing architectures and to exploit the special characteristics
of our custom architectures, we propose several high-level parallel optimal control
schemes that can reduce computation time. A novel optimization formulation was devised
for reducing the computational effort in solving certain problems independent of the computing
platform used. In order to be able to solve optimization problems in fixed-point
arithmetic, which is significantly more resource-efficient than floating-point, tailored linear
algebra algorithms were developed for solving the linear systems that form the computational
bottleneck in many optimization methods. These methods come with guarantees
for reliable operation. We also provide finite-precision error analysis for fixed-point implementations
of first-order methods that can be used to minimize the use of resources while
meeting accuracy specifications. The suggested techniques are demonstrated on several
practical examples, including a hardware-in-the-loop setup for optimization-based control
of a large airliner.
Open Access
Some Preconditioning Techniques for Saddle Point Problems
Saddle point problems arise frequently in many applications in science and engineering, including constrained optimization, mixed finite element formulations of partial differential equations, circuit analysis, and so forth. Indeed, the formulation of most problems with constraints gives rise to saddle point systems. This paper provides a concise overview of iterative approaches for the solution of such systems, which are of particular importance in the context of large scale computation. In particular, we describe some of the most useful preconditioning techniques for Krylov subspace solvers applied to saddle point problems, including block and constrained preconditioners.
The work of Michele Benzi was supported in part by the National Science Foundation grant DMS-0511336
Computing and deflating eigenvalues while solving multiple right hand side linear systems in Quantum Chromodynamics
We present a new algorithm that computes eigenvalues and eigenvectors of a
Hermitian positive definite matrix while solving a linear system of equations
with Conjugate Gradient (CG). Traditionally, all the CG iteration vectors could
be saved and recombined through the eigenvectors of the tridiagonal projection
matrix, which is equivalent theoretically to unrestarted Lanczos. Our algorithm
capitalizes on the iteration vectors produced by CG to update only a small
window of vectors that approximate the eigenvectors. While this window is
restarted in a locally optimal way, the CG algorithm for the linear system is
unaffected. Yet, in all our experiments, this small window converges to the
required eigenvectors at a rate identical to unrestarted Lanczos. After the
solution of the linear system, eigenvectors that have not accurately converged
can be improved in an incremental fashion by solving additional linear systems.
In this case, eigenvectors identified in earlier systems can be used to
deflate, and thus accelerate, the convergence of subsequent systems. We have
used this algorithm with excellent results in lattice QCD applications, where
hundreds of right hand sides may be needed. Specifically, about 70 eigenvectors
are obtained to full accuracy after solving 24 right hand sides. Deflating
these from the large number of subsequent right hand sides removes the dreaded
critical slowdown, where the conditioning of the matrix increases as the quark
mass reaches a critical value. Our experiments show almost a constant number of
iterations for our method, regardless of quark mass, and speedups of 8 over
original CG for light quark masses.
Comment: 22 pages, 26 eps figures
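The effect of deflation described above can be sketched on a toy problem. The example below assumes a diagonal positive definite matrix, so that its eigenvectors are the coordinate vectors and are known exactly; it does not reproduce the paper's algorithm for extracting eigenvectors from the CG iterates. The names `cg_diag`, `diag`, and the spectrum (one eigenvalue at 1e-4, the rest spread over [1, 2]) are hypothetical choices. Deflation is applied through the initial guess x0 = V (V^T A V)^{-1} V^T b, which removes the component of the residual along the known eigenvector.

```python
def cg_diag(diag, b, x0, tol=1e-10, max_iter=500):
    """CG for a diagonal SPD matrix given by its diagonal; returns (x, iterations)."""
    x = list(x0)
    r = [bi - d * xi for bi, d, xi in zip(b, diag, x)]
    p = list(r)
    rr = sum(ri * ri for ri in r)
    stop = tol * tol * sum(bi * bi for bi in b)
    for k in range(max_iter):
        if rr <= stop:
            return x, k
        Ap = [d * pi for d, pi in zip(diag, p)]
        alpha = rr / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rr_new = sum(ri * ri for ri in r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x, max_iter


n = 50
# Hypothetical spectrum: one tiny eigenvalue (the ill-conditioned mode) plus 49 in [1, 2].
diag = [1e-4] + [1.0 + i / (n - 2) for i in range(n - 1)]
b = [1.0] * n

# Plain CG starting from zero.
x_plain, it_plain = cg_diag(diag, b, [0.0] * n)

# Deflated start: solve exactly along the known eigenvector e_0, then run CG.
x0 = [0.0] * n
x0[0] = b[0] / diag[0]
x_defl, it_defl = cg_diag(diag, b, x0)

print(it_plain, it_defl)
```

With the small eigenvalue removed from the initial residual, CG effectively sees only the well-conditioned part of the spectrum, which is the mechanism behind the near-constant iteration counts reported in the abstract.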