A GPU-based hyperbolic SVD algorithm
A one-sided Jacobi hyperbolic singular value decomposition (HSVD) algorithm,
using a massively parallel graphics processing unit (GPU), is developed. The
algorithm also serves as the final stage of solving a symmetric indefinite
eigenvalue problem. Numerical testing demonstrates the gains in speed and
accuracy over sequential and MPI-parallelized variants of similar Jacobi-type
HSVD algorithms. Finally, possibilities of hybrid CPU-GPU parallelism are
discussed.
Comment: Accepted for publication in BIT Numerical Mathematics
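As a rough illustration of the one-sided Jacobi idea mentioned in the abstract above, here is a minimal sketch of the ordinary (non-hyperbolic) one-sided Jacobi method for singular values. It is sequential pure Python, not the GPU algorithm of the paper; the hyperbolic variant would additionally involve hyperbolic rotations tied to a signature matrix, which this sketch does not implement. The function name and parameters are hypothetical.

```python
import math

def one_sided_jacobi_singular_values(a, tol=1e-12, max_sweeps=50):
    """Singular values of a dense matrix (list of rows), in descending order.

    Assumption: plain trigonometric rotations only (ordinary SVD).
    """
    m, n = len(a), len(a[0])
    # Work on a column-major copy: rotations act on pairs of columns.
    cols = [[a[i][j] for i in range(m)] for j in range(n)]
    for _ in range(max_sweeps):
        rotated = False
        for p in range(n - 1):
            for q in range(p + 1, n):
                alpha = sum(x * x for x in cols[p])
                beta = sum(x * x for x in cols[q])
                gamma = sum(x * y for x, y in zip(cols[p], cols[q]))
                # Skip pairs that are already numerically orthogonal.
                if abs(gamma) <= tol * math.sqrt(alpha * beta):
                    continue
                rotated = True
                zeta = (beta - alpha) / (2.0 * gamma)
                t = math.copysign(1.0, zeta) / (abs(zeta) + math.sqrt(1.0 + zeta * zeta))
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = c * t
                # Rotate the two columns so that they become orthogonal.
                for i in range(m):
                    xp, xq = cols[p][i], cols[q][i]
                    cols[p][i] = c * xp - s * xq
                    cols[q][i] = s * xp + c * xq
        if not rotated:
            break
    # Once all columns are mutually orthogonal, their norms are the singular values.
    return sorted((math.sqrt(sum(x * x for x in col)) for col in cols), reverse=True)
```

For the matrix [[1, 1], [0, 1]] the two singular values are the golden ratio and its reciprocal, which gives a quick sanity check.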
Analysis of Schwarz methods for a hybridizable discontinuous Galerkin discretization
Schwarz methods are attractive parallel solvers for the large-scale linear
systems that arise when partial differential equations are discretized. For
hybridizable discontinuous Galerkin (HDG) methods, this is a relatively new
field of research, because HDG methods impose continuity across elements using
a Robin condition, while classical Schwarz solvers use Dirichlet transmission
conditions. Optimized Schwarz methods use Robin conditions to converge
faster than classical Schwarz methods, even without overlap, provided the
Robin parameter is well chosen. We present in this paper a
rigorous convergence analysis of Schwarz methods for the concrete case of
the hybridizable interior penalty (IPH) method. We show that the penalization
parameter needed for convergence of IPH leads to slow convergence of the
classical additive Schwarz method, and propose a modified solver which leads to
much faster convergence. Our analysis is entirely at the discrete level, and
thus holds for arbitrary interfaces between two subdomains. We then generalize
the method to the case of many subdomains, including cross points, and obtain a
new class of preconditioners for Krylov subspace methods which exhibit better
convergence properties than the classical additive Schwarz preconditioner. We
illustrate our results with numerical experiments.
Comment: 25 pages, 5 figures, 3 tables, accepted for publication in SINU
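The abstract's remark that classical Schwarz methods with Dirichlet transmission conditions depend on overlap can be seen in a toy problem. The sketch below is not the HDG/IPH solver of the paper; it assumes the 1D model problem u'' = 0 on (0, 1) with u(0) = 0 and u(1) = 1 (exact solution u(x) = x), where each subdomain solve is a straight line and only the interface values need to be tracked. All names are hypothetical.

```python
def alternating_schwarz_1d(a, b, g1=0.0, iterations=50):
    """Classical alternating Schwarz for u'' = 0 on (0, 1), u(0) = 0, u(1) = 1.

    Subdomains are (0, b) and (a, 1) with 0 < a <= b < 1.
    g1 is the Dirichlet value imposed on subdomain 1 at x = b,
    g2 is the Dirichlet value imposed on subdomain 2 at x = a.
    Returns the pair (g1, g2) after the given number of iterations.
    """
    g2 = 0.0
    for _ in range(iterations):
        # Subdomain 1: line through (0, 0) and (b, g1), evaluated at x = a.
        g2 = g1 * a / b
        # Subdomain 2: line through (a, g2) and (1, 1), evaluated at x = b.
        g1 = g2 + (1.0 - g2) * (b - a) / (1.0 - a)
    return g1, g2
```

With overlap (a < b) the interface error shrinks by the factor a(1 - b) / (b(1 - a)) per iteration, so the traces approach the exact values b and a. With a = b the factor is 1 and the iteration stays at its initial guess, which is the situation Robin transmission conditions are meant to repair.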
Regularized Jacobi iteration for decentralized convex optimization with separable constraints
We consider multi-agent, convex optimization programs subject to separable
constraints, where the constraint function of each agent involves only its
local decision vector, while the decision vectors of all agents are coupled via
a common objective function. We focus on a regularized variant of the so-called
Jacobi algorithm for decentralized computation in such problems. We first
consider the case where the objective function is quadratic, and provide a
fixed-point theoretic analysis showing that the algorithm converges to a
minimizer of the centralized problem. Moreover, we quantify the potential
benefits of such an iterative scheme by comparing it against a scaled projected
gradient algorithm. We then consider the general case and show that all limit
points of the proposed iteration are optimal solutions of the centralized
problem. The efficacy of the proposed algorithm is illustrated by applying it
to the problem of optimal charging of electric vehicles, where, as opposed to
earlier approaches, we show convergence to an optimal charging scheme for a
finite, possibly large, number of vehicles.
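To make the quadratic case concrete, here is a minimal sketch of a regularized Jacobi iteration. The abstract does not state the update rule, so several things are assumptions of this sketch: the objective is f(x) = 0.5 x^T Q x + q^T x with symmetric Q, each agent holds one scalar variable constrained to an interval, the regularization is a quadratic proximal term c (z - x_i)^2, and c is taken large enough for convergence. The function name and arguments are hypothetical.

```python
def regularized_jacobi(Q, q, lower, upper, c=1.0, iterations=200):
    """Parallel (Jacobi) updates with a proximal term, one scalar per agent.

    Agent i minimizes, over z in [lower[i], upper[i]],
        f(z, x_others) + c * (z - x[i]) ** 2
    while all other agents are held at their previous values.
    """
    n = len(q)
    # Start from the point of each interval closest to zero.
    x = [min(max(0.0, lower[i]), upper[i]) for i in range(n)]
    for _ in range(iterations):
        new_x = []
        for i in range(n):
            # Linear coefficient seen by agent i with the others fixed.
            r = q[i] + sum(Q[i][j] * x[j] for j in range(n) if j != i)
            # Unconstrained minimizer of the regularized scalar problem.
            z = (2.0 * c * x[i] - r) / (Q[i][i] + 2.0 * c)
            # In one dimension, clipping gives the constrained minimizer.
            new_x.append(min(max(z, lower[i]), upper[i]))
        # All agents switch to their new values simultaneously.
        x = new_x
    return x
```

For Q = [[2, 1], [1, 2]] and q = [-3, -3], the unconstrained minimizer is (1, 1); with upper bounds of 0.8 the iterates settle at the bounds instead.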
GHOST: Building blocks for high performance sparse linear algebra on heterogeneous systems
While many of the architectural details of future exascale-class
high-performance computer systems are still a matter of intense research, there
appears to be a general consensus that they will be strongly heterogeneous,
featuring "standard" as well as "accelerated" resources. Today, such resources
are available as multicore processors, graphics processing units (GPUs), and
other accelerators such as the Intel Xeon Phi. Any software infrastructure that
claims usefulness for such environments must be able to meet their inherent
challenges: massive multi-level parallelism, topology, asynchronicity, and
abstraction. The "General, Hybrid, and Optimized Sparse Toolkit" (GHOST) is a
collection of building blocks that targets algorithms dealing with sparse
matrix representations on current and future large-scale systems. It implements
the "MPI+X" paradigm, has a pure C interface, and provides hybrid-parallel
numerical kernels, intelligent resource management, and truly heterogeneous
parallelism for multicore CPUs, Nvidia GPUs, and the Intel Xeon Phi. We
describe the details of its design with respect to the challenges posed by
modern heterogeneous supercomputers and recent algorithmic developments.
Implementation details which are indispensable for achieving high efficiency
are pointed out and their necessity is justified by performance measurements or
predictions based on performance models. The library code and several
applications are available as open source. We also provide instructions on how
to make use of GHOST in existing software packages, together with a case study
which demonstrates the applicability and performance of GHOST as a component
within a larger software stack.
Comment: 32 pages, 11 figures
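For readers unfamiliar with the kind of kernel such a toolkit provides, the following is a minimal sketch of a sparse matrix-vector product in compressed sparse row (CSR) storage. It is a generic, sequential illustration only: it is not GHOST's interface (which the abstract describes as pure C), and the storage layout and all names here are assumptions of the sketch.

```python
def csr_matvec(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    values  : nonzero entries, row by row
    col_idx : column index of each entry in values
    row_ptr : row i occupies values[row_ptr[i]:row_ptr[i + 1]]
    """
    y = [0.0] * (len(row_ptr) - 1)
    for row in range(len(y)):
        acc = 0.0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y
```

Each row is independent of the others, which is what makes this kernel a natural candidate for the multi-level parallelism the abstract refers to.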