139 research outputs found
An Introduction to Algebraic Multigrid
Algebraic multigrid (AMG) solves linear systems based on multigrid principles, but in a way that only depends on the coefficients in the underlying matrix. The author begins with a basic introduction to AMG methods, and then describes some more recent advances and theoretical development
Sparse matrix-vector multiplication on GPGPU clusters: A new storage format and a scalable implementation
Sparse matrix-vector multiplication (spMVM) is the dominant operation in many
sparse solvers. We investigate performance properties of spMVM with matrices of
various sparsity patterns on the nVidia "Fermi" class of GPGPUs. A new "padded
jagged diagonals storage" (pJDS) format is proposed which may substantially
reduce the memory overhead intrinsic to the widespread ELLPACK-R scheme. In our
test scenarios the pJDS format cuts the overall spMVM memory footprint on the
GPGPU by up to 70%, and achieves 95% to 130% of the ELLPACK-R performance.
Using a suitable performance model we identify performance bottlenecks on the
node level that invalidate some types of matrix structures for efficient
multi-GPGPU parallelization. For appropriate sparsity patterns we extend
previous work on distributed-memory parallel spMVM to demonstrate a scalable
hybrid MPI-GPGPU code, achieving efficient overlap of communication and
computation.
Comment: 10 pages, 5 figures. Added reference to other recent sparse matrix format
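The memory argument in the abstract above can be illustrated with a small sketch. It is not the authors' implementation: it only counts stored slots, assuming that ELLPACK-style storage pads every row to the longest row in the matrix, while a pJDS-like layout sorts rows by length and pads each block of rows to that block's longest row. The function names and the `block_rows` parameter are hypothetical.

```python
def ellpack_slots(row_lengths):
    """Slots stored when every row is padded to the longest row."""
    if not row_lengths:
        return 0
    return len(row_lengths) * max(row_lengths)


def blocked_sorted_slots(row_lengths, block_rows=4):
    """Slots stored when rows are sorted by length (descending) and each
    block of up to `block_rows` rows is padded to its own longest row."""
    ordered = sorted(row_lengths, reverse=True)
    total = 0
    for start in range(0, len(ordered), block_rows):
        block = ordered[start:start + block_rows]
        # After sorting, the first row of the block is its longest one.
        total += len(block) * block[0]
    return total


# Nonzeros per row of a small, made-up matrix with one long row.
lengths = [8, 1, 1, 2, 1, 3, 1, 1]
print(sum(lengths))                      # actual nonzeros: 18
print(ellpack_slots(lengths))            # padded to the longest row: 64
print(blocked_sorted_slots(lengths, 4))  # sorted, padded per block: 36
```

A single long row dominates the ELLPACK-style count, whereas sorting confines its padding cost to one block. The actual savings depend on the sparsity pattern and the chosen block size.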
Parallel sparse matrix-vector multiplication as a test case for hybrid MPI+OpenMP programming
We evaluate optimized parallel sparse matrix-vector operations for two
representative application areas on widespread multicore-based cluster
configurations. First the single-socket baseline performance is analyzed and
modeled with respect to basic architectural properties of standard multicore
chips. Going beyond the single node, parallel sparse matrix-vector operations
often suffer from an unfavorable communication to computation ratio. Starting
from the observation that nonblocking MPI is not able to hide communication
cost using standard MPI implementations, we demonstrate that explicit overlap
of communication and computation can be achieved by using a dedicated
communication thread, which may run on a virtual core. We compare our approach
to pure MPI and the widely used "vector-like" hybrid programming strategy.
Comment: 12 pages, 6 figures
Preconditioners for state constrained optimal control problems with Moreau-Yosida penalty function tube
Optimal control problems with partial differential equations play an important role in many applications. The inclusion of bound constraints for the state poses a significant challenge for optimization methods. Our focus here is on the incorporation of the constraints via the Moreau-Yosida regularization technique. This method has been studied recently and has proven to be advantageous compared to other approaches. In this paper we develop preconditioners for the efficient solution of the Newton steps associated with the fast solution of the Moreau-Yosida regularized problem. Numerical results illustrate the competitiveness of this approach.
Copyright © 2000 John Wiley & Sons, Ltd
Preconditioning for Allen-Cahn variational inequalities with non-local constraints
The solution of Allen-Cahn variational inequalities with mass constraints is of interest
in many applications. This problem can be solved both in its scalar and vector-valued form as a
PDE-constrained optimization problem by means of a primal-dual active set method. At the heart
of this method lies the solution of linear systems in saddle point form. In this paper we propose the
use of Krylov-subspace solvers and suitable preconditioners for the saddle point systems. Numerical
results illustrate the competitiveness of this approach
Preconditioners for state constrained optimal control problems with Moreau-Yosida penalty function
Optimal control problems with partial differential equations as constraints play an important role in many applications. The inclusion of bound constraints for the state variable poses a significant challenge for optimization methods. Our focus here is on the incorporation of the constraints via the Moreau-Yosida regularization technique. This method has been studied recently and has proven to be advantageous compared to other approaches. In this paper we develop robust preconditioners for the efficient solution of the Newton steps associated with solving the Moreau-Yosida regularized problem. Numerical results illustrate the efficiency of our approach
Hybrid-parallel sparse matrix-vector multiplication with explicit communication overlap on current multicore-based systems
We evaluate optimized parallel sparse matrix-vector operations for several
representative application areas on widespread multicore-based cluster
configurations. First the single-socket baseline performance is analyzed and
modeled with respect to basic architectural properties of standard multicore
chips. Beyond the single node, the performance of parallel sparse matrix-vector
operations is often limited by communication overhead. Starting from the
observation that nonblocking MPI is not able to hide communication cost using
standard MPI implementations, we demonstrate that explicit overlap of
communication and computation can be achieved by using a dedicated
communication thread, which may run on a virtual core. Moreover we identify
performance benefits of hybrid MPI/OpenMP programming due to improved load
balancing even without explicit communication overlap. We compare performance
results for pure MPI, the widely used "vector-like" hybrid programming
strategies, and explicit overlap on a modern multicore-based cluster and a Cray
XE6 system.
Comment: 16 pages, 10 figures
All-at-once solution of time-dependent PDE-constrained optimization problems
Time-dependent partial differential equations (PDEs) play an important role in applied mathematics and many other areas of science. One-shot methods try to compute the solution to these problems in a single iteration that solves for all time-steps at the same time. In this paper, we look at one-shot approaches for the optimal control of time-dependent PDEs and focus on the fast solution of these problems. The use of Krylov subspace solvers together with an efficient preconditioner allows for minimal storage requirements. We solve only approximate time-evolutions for both forward and adjoint problem and compute accurate solutions of a given control problem only at convergence of the overall Krylov subspace iteration. We show that our approach can give competitive results for a variety of problem formulations
All-at-Once Solution of Time-Dependent PDE-Constrained Optimisation Problems
Time-dependent partial differential equations (PDEs) play an important role in applied mathematics and many other areas of science. One-shot methods try to compute the solution to these problems in a single iteration that solves for all time-steps at the same time. In this paper, we look at one-shot approaches for the optimal control of time-dependent PDEs and focus on the fast solution of these problems. The use of Krylov subspace solvers together with an efficient preconditioner allows for minimal storage requirements. We solve only approximate time-evolutions for both forward and adjoint problem and compute accurate solutions of a given control problem only at convergence of the overall Krylov subspace iteration. We show that our approach can give competitive results for a variety of problem formulations