The automatic solution of partial differential equations using a global spectral method
A spectral method for solving linear partial differential equations (PDEs)
with variable coefficients and general boundary conditions defined on
rectangular domains is described, based on separable representations of partial
differential operators and the one-dimensional ultraspherical spectral method.
If a partial differential operator is of splitting rank 2, such as the
operator associated with Poisson or Helmholtz, the corresponding PDE is solved
via a generalized Sylvester matrix equation, and a bivariate polynomial
approximation of the solution of degree (n_x, n_y) is computed in
O((n_x n_y)^(3/2)) operations. Partial differential operators of
splitting rank 3 or greater are solved via a linear system involving a block-banded
matrix in O(min(n_x^3 n_y, n_x n_y^3)) operations. Numerical
examples demonstrate the applicability of our 2D spectral method to a broad
class of PDEs, which includes elliptic and dispersive time-evolution equations.
The resulting PDE solver is written in MATLAB and is publicly available as part
of CHEBFUN. It can resolve solutions requiring over a million degrees of
freedom in a matter of seconds. An experimental implementation in the Julia
language can currently perform the same solve even faster.
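To make the matrix-equation structure referred to above concrete, here is a minimal Python/SciPy sketch: a Poisson problem on the unit square written as a Sylvester equation and handed to scipy.linalg.solve_sylvester. The second-order finite-difference discretization, grid size, and manufactured solution are illustrative stand-ins, not the ultraspherical spectral discretization used by the paper.

```python
# Hedged sketch: a Poisson problem written as a Sylvester matrix equation,
# the structure the abstract exploits (here with plain finite differences
# instead of the ultraspherical spectral discretization).
import numpy as np
from scipy.linalg import solve_sylvester

n = 64                              # interior grid points per direction (illustrative)
h = 1.0 / (n + 1)
x = np.linspace(h, 1 - h, n)
X, Y = np.meshgrid(x, x, indexing="ij")

# 1D second-difference matrix with zero Dirichlet boundary conditions.
D = (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
     + np.diag(np.ones(n - 1), -1)) / h**2

# Manufactured solution u = sin(pi x) sin(pi y), so  u_xx + u_yy = -2 pi^2 u.
U_exact = np.sin(np.pi * X) * np.sin(np.pi * Y)
F = -2.0 * np.pi**2 * U_exact

# Poisson in matrix form:  D U + U D^T = F  (a Sylvester equation).
U = solve_sylvester(D, D.T, F)

print("max error:", np.abs(U - U_exact).max())   # O(h^2) for this stencil
```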
Geometry-Oblivious FMM for Compressing Dense SPD Matrices
We present GOFMM (geometry-oblivious FMM), a novel method that creates a
hierarchical low-rank approximation, "compression," of an arbitrary dense
symmetric positive definite (SPD) matrix. For many applications, GOFMM enables
an approximate matrix-vector multiplication in O(N log N) or even O(N) time,
where N is the matrix size. Compression requires O(N log N) storage and work.
In general, our scheme belongs to the family of hierarchical matrix
approximation methods. In particular, it generalizes the fast multipole method
(FMM) to a purely algebraic setting by only requiring the ability to sample
matrix entries. Neither geometric information (i.e., point coordinates) nor
knowledge of how the matrix entries have been generated is required, thus the
term "geometry-oblivious." Also, we introduce a shared-memory parallel scheme
for hierarchical matrix computations that reduces synchronization barriers. We
present results on the Intel Knights Landing and Haswell architectures, and on
the NVIDIA Pascal architecture for a variety of matrices.
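The sketch below illustrates the algebraic, sampling-only idea behind such compressions, not GOFMM itself: a CUR-type skeleton approximation of one off-diagonal block of an SPD kernel matrix, built purely from sampled rows and columns. The kernel, matrix size, and sample count are illustrative assumptions.

```python
# Hedged sketch (not GOFMM itself): the core ingredient of an algebraic,
# geometry-oblivious compression -- a low-rank skeleton (CUR-type)
# approximation of an off-diagonal block built only from sampled entries.
import numpy as np

rng = np.random.default_rng(0)

# Any SPD matrix will do; here a Gaussian kernel matrix on random 1D points
# (the points are used only to *generate* K, never by the compression).
pts = np.sort(rng.uniform(size=512))
K = np.exp(-(pts[:, None] - pts[None, :])**2 / 0.05) + 1e-8 * np.eye(512)

# Off-diagonal block between the two halves of the index set.
left, right = np.arange(256), np.arange(256, 512)

r = 12                                          # sampled rows/columns (sketch rank)
I = rng.choice(left, r, replace=False)          # sampled row indices
J = rng.choice(right, r, replace=False)         # sampled column indices

C = K[np.ix_(left, J)]                          # sampled columns of the block
R = K[np.ix_(I, right)]                         # sampled rows of the block
W = np.linalg.pinv(K[np.ix_(I, J)])             # core inverse (pseudo-inverse)

block = K[np.ix_(left, right)]
approx = C @ W @ R                              # rank-r skeleton approximation
print("relative error:", np.linalg.norm(block - approx) / np.linalg.norm(block))
```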
Algorithmic patterns for H-matrices on many-core processors
In this work, we consider the reformulation of hierarchical (H) matrix
algorithms for many-core processors, with a model implementation on graphics
processing units (GPUs). H-matrices approximate specific dense matrices, e.g.,
from discretized integral equations or kernel ridge regression, leading to
log-linear time complexity in dense matrix-vector products. The
parallelization of H-matrix operations on many-core processors is difficult
due to the complex nature of the underlying algorithms. While previous
algorithmic advances for many-core hardware focused on accelerating existing
H-matrix CPU implementations with many-core processors, we here aim at relying
entirely on that processor type. As our main contribution, we introduce the
parallel algorithmic patterns needed to map the full H-matrix construction and
the fast matrix-vector product to many-core hardware. Crucial ingredients are
space-filling curves, parallel tree traversal, and batching of linear algebra
operations. The resulting model GPU implementation, hmglib, is, to the best of
the authors' knowledge, the first entirely GPU-based open-source H-matrix
library of this kind. We conclude this work with an in-depth performance
analysis and a comparative performance study against a standard H-matrix
library, highlighting profound speedups of our many-core parallel approach.
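As a small illustration of the space-filling-curve ingredient mentioned above, the sketch below computes Z-order (Morton) keys for 2D points, the kind of ordering that lets a cluster tree be built by simply bisecting a sorted index array. The function names, bit width, and point set are illustrative assumptions; this is not code from hmglib.

```python
# Hedged sketch: a Z-order (Morton) key for 2D points, the kind of
# space-filling-curve ordering used to build cluster trees without
# complex geometric partitioning.
import numpy as np

def part1by1(x: np.ndarray) -> np.ndarray:
    """Spread the lower 16 bits of x so that a zero bit sits between every bit."""
    x = x.astype(np.uint32) & 0x0000FFFF
    x = (x | (x << 8)) & 0x00FF00FF
    x = (x | (x << 4)) & 0x0F0F0F0F
    x = (x | (x << 2)) & 0x33333333
    x = (x | (x << 1)) & 0x55555555
    return x

def morton_key(xy: np.ndarray, bits: int = 16) -> np.ndarray:
    """Interleave quantized x/y coordinates into a single Morton key."""
    lo = xy.min(axis=0)
    scale = (2**bits - 1) / (xy.max(axis=0) - lo)
    q = ((xy - lo) * scale).astype(np.uint32)        # quantize to [0, 2^bits)
    return part1by1(q[:, 0]) | (part1by1(q[:, 1]) << 1)

rng = np.random.default_rng(1)
points = rng.uniform(size=(1_000, 2))
order = np.argsort(morton_key(points))   # nearby points end up nearby in 'order'
# Splitting 'order' recursively in half yields a simple cluster tree.
```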
Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions
Low-rank matrix approximations, such as the truncated singular value decomposition and the rank-revealing QR decomposition, play a central role in data analysis and scientific computing. This work surveys and extends recent research which demonstrates that randomization offers a powerful tool for performing low-rank matrix approximation. These techniques exploit modern computational architectures more fully than classical methods and open the possibility of dealing with truly massive data sets. This paper presents a modular framework for constructing randomized algorithms that compute partial matrix decompositions. These methods use random sampling to identify a subspace that captures most of the action of a matrix. The input matrix is then compressed—either explicitly or
implicitly—to this subspace, and the reduced matrix is manipulated deterministically to obtain the desired low-rank factorization. In many cases, this approach beats its classical competitors in terms of accuracy, robustness, and/or speed. These claims are supported by extensive numerical experiments and a detailed error analysis. The specific benefits of randomized techniques depend on the computational environment. Consider the model problem of finding the k dominant components of the singular value decomposition of an m × n matrix. (i) For a dense input matrix, randomized algorithms require O(mn log(k))
floating-point operations (flops) in contrast to O(mnk) for classical algorithms. (ii) For a sparse input matrix, the flop count matches classical Krylov subspace methods, but the randomized approach is more robust and can easily be reorganized to exploit multiprocessor architectures. (iii) For a matrix that is too large to fit in fast memory, the randomized techniques require only a constant number of passes over the data, as opposed to O(k) passes for classical algorithms. In fact, it is sometimes possible to perform matrix approximation with a single pass over the data.
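A minimal sketch of the two-stage scheme the survey describes: sample the range with a random test matrix, orthonormalize, and finish with a deterministic SVD on the small compressed matrix. This sketch uses a Gaussian test matrix, which costs O(mnk); the O(mn log(k)) count quoted above relies on structured (FFT-based) test matrices. Function and parameter names are illustrative.

```python
# Hedged sketch of a randomized range finder followed by a truncated SVD
# of the compressed matrix (the basic scheme surveyed in the paper).
import numpy as np

def randomized_svd(A, k, p=10, seed=None):
    """Rank-k SVD approximation via a randomized range finder.

    k : target rank
    p : oversampling parameter (a handful of extra columns)
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    Omega = rng.standard_normal((n, k + p))   # random test matrix
    Y = A @ Omega                             # sample the range of A
    Q, _ = np.linalg.qr(Y)                    # orthonormal basis, Q: m x (k+p)
    B = Q.T @ A                               # small compressed matrix
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    return (Q @ Ub)[:, :k], s[:k], Vt[:k, :]

# Quick check on a matrix with rapidly decaying singular values.
rng = np.random.default_rng(0)
A = rng.standard_normal((500, 80)) @ np.diag(0.5 ** np.arange(80)) \
    @ rng.standard_normal((80, 300))
U, s, Vt = randomized_svd(A, k=20)
print("relative error:", np.linalg.norm(A - (U * s) @ Vt) / np.linalg.norm(A))
```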
Efficient Randomized Algorithms for the Fixed-Precision Low-Rank Matrix Approximation
Randomized algorithms for low-rank matrix approximation are investigated,
with the emphasis on the fixed-precision problem and computational efficiency
for handling large matrices. The algorithms are based on the so-called QB
factorization, where Q has orthonormal columns. Firstly, a mechanism for
calculating the approximation error in the Frobenius norm is proposed, which
enables efficient adaptive rank determination for large and/or sparse matrices.
It can be combined with any QB-form factorization algorithm in which B's rows
are incrementally generated. Based on the blocked randQB algorithm of P.-G.
Martinsson and S. Voronin, this results in an algorithm called randQB_EI. We
then further revise the algorithm to obtain a pass-efficient variant, randQB_FP,
which is mathematically equivalent to the existing randQB algorithms and also
suited to the fixed-precision problem. In particular, randQB_FP can serve as a
single-pass algorithm for computing leading singular values under certain
conditions. With large and/or sparse test matrices, we empirically validate the
merits of the proposed techniques, which exhibit remarkable speedup and memory
savings over the blocked randQB algorithm. We also demonstrate that the
single-pass algorithm derived from randQB_FP is much more accurate than an
existing single-pass algorithm. Finally, with data from a scenic image and an
information retrieval application, we show the advantages of the proposed
algorithms over the adaptive range finder algorithm for solving the
fixed-precision problem.
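The error mechanism described above can be sketched in a few lines: for Q with orthonormal columns and B = Q^T A, the identity ||A - QB||_F^2 = ||A||_F^2 - ||B||_F^2 lets a blocked QB loop track its Frobenius-norm error incrementally and stop at a prescribed precision. The sketch below is a generic blocked QB loop built on that identity, not the randQB_EI or randQB_FP codes themselves; names and block size are illustrative.

```python
# Hedged sketch of fixed-precision (adaptive-rank) QB factorization:
# since B = Q^T A with Q orthonormal, ||A - QB||_F^2 = ||A||_F^2 - ||B||_F^2,
# so the error can be updated incrementally as blocks of B are generated.
import numpy as np

def blocked_qb(A, tol, block=16, max_rank=None, seed=None):
    rng = np.random.default_rng(seed)
    m, n = A.shape
    max_rank = max_rank or min(m, n)
    Qs, Bs = [], []
    err2 = np.linalg.norm(A, "fro") ** 2           # running value of ||A - QB||_F^2
    target2 = (tol * np.linalg.norm(A, "fro")) ** 2
    while err2 > target2 and sum(Bi.shape[0] for Bi in Bs) < max_rank:
        Omega = rng.standard_normal((n, block))
        Yi = A @ Omega
        for Qj in Qs:                              # re-orthogonalize against earlier blocks
            Yi -= Qj @ (Qj.T @ Yi)
        Qi, _ = np.linalg.qr(Yi)
        Bi = Qi.T @ A
        err2 -= np.linalg.norm(Bi, "fro") ** 2     # incremental Frobenius-error update
        Qs.append(Qi); Bs.append(Bi)
    return np.hstack(Qs), np.vstack(Bs)

A = np.random.default_rng(0).standard_normal((400, 300))
A = A @ np.diag(0.8 ** np.arange(300)) @ np.random.default_rng(1).standard_normal((300, 300))
Q, B = blocked_qb(A, tol=1e-6)
print("rank:", Q.shape[1], "rel. error:", np.linalg.norm(A - Q @ B) / np.linalg.norm(A))
```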
Modeling of Spatial Uncertainties in the Magnetic Reluctivity
In this paper a computationally efficient approach is suggested for the
stochastic modeling of an inhomogeneous reluctivity of magnetic materials.
These materials can be part of electrical machines, such as a single-phase
transformer (the benchmark example considered in this paper). The approach is
based on the Karhunen-Loève expansion. The stochastic model is further used to
study the statistics of the self-inductance of the primary coil as a quantity
of interest.
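As a small illustration of the modeling device, the sketch below draws one realization of a 1D random field from a truncated Karhunen-Loève expansion, using an eigendecomposition of an exponential covariance matrix on a uniform grid. The covariance kernel, correlation length, and mean value are illustrative assumptions, not the paper's transformer model.

```python
# Hedged sketch of a truncated Karhunen-Loeve expansion for a 1D random
# field (covariance kernel and parameters are illustrative).
import numpy as np

n, L, sigma, corr_len, terms = 200, 1.0, 0.1, 0.2, 20
x = np.linspace(0.0, L, n)

# Exponential covariance C(x, y) = sigma^2 exp(-|x - y| / corr_len).
C = sigma**2 * np.exp(-np.abs(x[:, None] - x[None, :]) / corr_len)

# Discrete KL modes: eigenpairs of the covariance matrix, largest first.
lam, phi = np.linalg.eigh(C)
lam, phi = lam[::-1][:terms], phi[:, ::-1][:, :terms]

# One realization: mean field plus sum_k sqrt(lambda_k) * xi_k * phi_k(x).
rng = np.random.default_rng(0)
xi = rng.standard_normal(terms)
mean_value = 1.0                           # placeholder mean of the field
field = mean_value + phi @ (np.sqrt(lam) * xi)
print("retained variance fraction:", lam.sum() / np.trace(C))
```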