High-Performance Solvers for Dense Hermitian Eigenproblems
We introduce a new collection of solvers - subsequently called EleMRRR - for
large-scale dense Hermitian eigenproblems. EleMRRR solves various types of
problems: generalized, standard, and tridiagonal eigenproblems. Among these,
the last is of particular importance as it is a solver in its own right, as
well as the computational kernel for the first two; we present a fast and
scalable tridiagonal solver based on the Algorithm of Multiple Relatively
Robust Representations - referred to as PMRRR. Like the other EleMRRR solvers,
PMRRR is part of the freely available Elemental library, and is designed to
fully support both message-passing (MPI) and multithreading parallelism (SMP).
As a result, the solvers can equally be used in pure MPI or in hybrid MPI-SMP
fashion. We conducted a thorough performance study of EleMRRR and ScaLAPACK's
solvers on two supercomputers. Such a study, performed with up to 8,192 cores,
provides precise guidelines to assemble the fastest solver within the ScaLAPACK
framework; it also indicates that EleMRRR outperforms even the fastest solvers
built from ScaLAPACK's components.
An Optimized and Scalable Eigensolver for Sequences of Eigenvalue Problems
In many scientific applications the solution of non-linear differential
equations is obtained through the set-up and solution of a number of
successive eigenproblems. These eigenproblems can be regarded as a sequence
whenever the solution of one problem fosters the initialization of the next. In
addition, in some eigenproblem sequences there is a connection between the
solutions of adjacent eigenproblems. Whenever it is possible to unravel the
existence of such a connection, the eigenproblem sequence is said to be
correlated. When faced with a sequence of correlated eigenproblems the current
strategy amounts to solving each eigenproblem in isolation. We propose an
alternative approach which exploits such correlation through the use of an
eigensolver based on subspace iteration and accelerated with Chebyshev
polynomials (ChFSI). The resulting eigensolver is optimized by minimizing the
number of matrix-vector multiplications and parallelized using the Elemental
library framework. Numerical results show that ChFSI achieves excellent
scalability and is competitive with current dense linear algebra parallel
eigensolvers.
Comment: 23 Pages, 6 figures. First revision of an invited submission to
special issue of Concurrency and Computation: Practice and Experience
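
The idea in this abstract, Chebyshev-filtered subspace iteration that is warm-started from the eigenvectors of the previous problem in a sequence, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes a real symmetric standard eigenproblem, a fixed filter degree (the paper instead tunes the number of matrix-vector products), a crude spectral upper bound from the matrix 1-norm, and no Elemental parallelism. The names `chebyshev_filter`, `chfsi`, `extra`, and the test matrices are hypothetical.

```python
import numpy as np

def chebyshev_filter(A, X, degree, lower, upper):
    """Apply a degree-`degree` Chebyshev polynomial of the shifted, scaled A to X.

    The unwanted interval [lower, upper] is mapped to [-1, 1], where Chebyshev
    polynomials stay bounded by 1; components belonging to eigenvalues below
    `lower` grow rapidly with the degree. Requires degree >= 1.
    """
    c = 0.5 * (upper + lower)   # centre of the damped interval
    e = 0.5 * (upper - lower)   # half-width of the damped interval
    Y_prev = X                  # T_0 applied to X
    Y = (A @ X - c * X) / e     # T_1 applied to X
    for _ in range(2, degree + 1):
        # three-term recurrence T_{j+1}(t) = 2 t T_j(t) - T_{j-1}(t)
        Y_prev, Y = Y, 2.0 * (A @ Y - c * Y) / e - Y_prev
    return Y

def chfsi(A, nev, X0=None, extra=5, degree=10, tol=1e-8, max_iter=200, seed=0):
    """Lowest `nev` eigenpairs of symmetric A by filtered subspace iteration.

    `extra` guard vectors enlarge the block so that the wanted eigenvalues sit
    strictly below the damped interval. Returns (eigenvalues, block, iterations).
    """
    n = A.shape[0]
    if X0 is None:              # cold start from a random block
        X0 = np.random.default_rng(seed).standard_normal((n, nev + extra))
    X, _ = np.linalg.qr(X0)
    upper = np.linalg.norm(A, 1)  # ||A||_1 bounds the spectrum of symmetric A
    for it in range(1, max_iter + 1):
        # Rayleigh-Ritz: rotate the block to Ritz vectors, get Ritz values
        theta, S = np.linalg.eigh(X.T @ A @ X)
        X = X @ S
        resid = np.linalg.norm(A @ X[:, :nev] - X[:, :nev] * theta[:nev], axis=0)
        if resid.max() <= tol * upper or it == max_iter:
            return theta[:nev], X, it
        # damp everything above the largest Ritz value of the block
        X = chebyshev_filter(A, X, degree, theta[-1], upper)
        X, _ = np.linalg.qr(X)

# Two nearby ("correlated") problems: A1 is a small perturbation of A0.
rng = np.random.default_rng(1)
n, nev = 100, 5
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A0 = (Q * np.arange(1.0, n + 1)) @ Q.T        # eigenvalues 1, 2, ..., n
A0 = 0.5 * (A0 + A0.T)
E = rng.standard_normal((n, n))
A1 = A0 + 1e-3 * (E + E.T)

vals0, X_prev, its_cold = chfsi(A0, nev)              # random initial block
vals1, _, its_warm = chfsi(A1, nev, X0=X_prev)        # reuse previous block
print("iterations (cold, warm):", its_cold, its_warm)
```

Passing the converged block of one problem as `X0` for the next is the only coupling between problems in this sketch; when adjacent problems are close, the warm start would be expected to need fewer filter applications than a random start.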
Fast computation of spectral projectors of banded matrices
We consider the approximate computation of spectral projectors for symmetric
banded matrices. While this problem has received considerable attention,
especially in the context of linear scaling electronic structure methods, the
presence of small relative spectral gaps challenges existing methods based on
approximate sparsity. In this work, we show how a data-sparse approximation
based on hierarchical matrices can be used to overcome this problem. We prove a
priori bounds on the approximation error and propose a fast algorithm based
on the QDWH algorithm, following the work of Nakatsukasa et al. Numerical
experiments demonstrate that the performance of our algorithm is robust with
respect to the spectral gap. A preliminary Matlab implementation becomes faster
than eig already for matrix sizes of a few thousand.
Comment: 27 pages, 10 figures
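
For a symmetric matrix A and a shift mu lying in a spectral gap, the spectral projector onto the eigenvalues below mu can be written as P = (I - sign(A - mu I)) / 2, and for symmetric matrices the matrix sign coincides with the orthogonal polar factor, which is what QDWH computes. The sketch below illustrates that relation only. It swaps in the plain Newton sign iteration for QDWH and uses dense NumPy arrays rather than hierarchical matrices, so it costs O(n^3) per step and does not reflect the paper's algorithm or its complexity. The function name and the test matrix are hypothetical.

```python
import numpy as np

def spectral_projector(A, mu, tol=1e-12, max_iter=100):
    """Projector onto the invariant subspace of symmetric A for eigenvalues < mu.

    Uses the Newton iteration X <- (X + X^{-1}) / 2, which converges to
    sign(A - mu*I) provided mu is not an eigenvalue of A.
    """
    n = A.shape[0]
    I = np.eye(n)
    X = A - mu * I
    for _ in range(max_iter):
        X_next = 0.5 * (X + np.linalg.inv(X))
        done = np.linalg.norm(X_next - X) <= tol * np.linalg.norm(X_next)
        X = X_next
        if done:
            break
    X = 0.5 * (X + X.T)          # remove rounding-level asymmetry
    return 0.5 * (I - X)

# Tridiagonal (banded) test matrix with alternating diagonal +1/-1 and
# constant off-diagonal t; its eigenvalues satisfy |lambda| >= 1, so mu = 0
# lies in a spectral gap.
n, t = 40, 0.3
d = np.where(np.arange(n) % 2 == 0, 1.0, -1.0)
A = np.diag(d) + t * (np.eye(n, k=1) + np.eye(n, k=-1))

P = spectral_projector(A, mu=0.0)
print("trace(P) =", np.trace(P))   # number of eigenvalues below mu
```

The Newton iteration needs more steps as the gap around mu shrinks relative to the spectral width, which is the regime where the abstract reports its method to remain robust.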
Scaling the semidefinite program solver SDPB
We present enhancements to SDPB, an open source, parallelized, arbitrary
precision semidefinite program solver designed for the conformal bootstrap. The
main enhancement is significantly improved performance and scalability using
the Elemental library and MPI. The result is a new version of SDPB that runs on
multiple nodes with hundreds of cores with excellent scaling, making it
practical to solve larger problems. We demonstrate performance on a
moderate-size problem in the 3d Ising CFT and a much larger problem in the
Model.
Comment: 13 pages plus references, 2 figures
Improved Accuracy and Parallelism for MRRR-based Eigensolvers -- A Mixed Precision Approach
The real symmetric tridiagonal eigenproblem is of outstanding importance in
numerical computations; it arises frequently as part of eigensolvers for
standard and generalized dense Hermitian eigenproblems that are based on a
reduction to tridiagonal form. For its solution, the algorithm of Multiple
Relatively Robust Representations (MRRR) is among the fastest methods. Although
fast, the solvers based on MRRR do not deliver the same accuracy as competing
methods like Divide & Conquer or the QR algorithm. In this paper, we
demonstrate that the use of mixed precisions leads to improved accuracy of
MRRR-based eigensolvers with limited or no performance penalty. As a result, we
obtain eigensolvers that are not only equally or more accurate than the best
available methods, but also, in most circumstances, faster and more scalable
than the competition.
Rigorous optimisation of multilinear discriminant analysis with Tucker and PARAFAC structures
Background: We propose rigorously optimised supervised feature extraction methods for multilinear data based on Multilinear Discriminant Analysis (MDA) and demonstrate their usage on Electroencephalography (EEG) and simulated data. While existing MDA methods use heuristic optimisation procedures based on an ambiguous Tucker structure, we propose a rigorous approach via optimisation on the cross-product of Stiefel manifolds. We also introduce MDA methods with the PARAFAC structure. We compare the proposed approaches to existing MDA methods and unsupervised multilinear decompositions.
Results: We find that manifold optimisation substantially improves MDA objective functions relative to existing methods and, on simulated data, in general improves classification performance. However, we find similar classification performance when applied to the electroencephalography data. Furthermore, supervised approaches substantially outperform unsupervised multilinear methods, whereas methods with the PARAFAC structure perform similarly to those with Tucker structures. Notably, despite applying the MDA procedures to raw Brain-Computer Interface data, their performances are on par with results employing ample pre-processing, and they extract discriminatory patterns similar to the brain activity known to be elicited in the investigated EEG paradigms.
Conclusion: The proposed usage of manifold optimisation constitutes the first rigorous and monotonic optimisation approach for MDA methods and allows for MDA with the PARAFAC structure. Our results show that MDA methods applied to raw EEG data can extract discriminatory patterns when compared to traditional unsupervised multilinear feature extraction approaches, whereas the proposed PARAFAC structured MDA models provide meaningful patterns of activity.