132 research outputs found

    Out-of-core macromolecular simulations on multithreaded architectures

    We address the solution of large-scale eigenvalue problems that appear in the motion simulation of complex macromolecules on multithreaded platforms, consisting of multicore processors and possibly a graphics processor (GPU). In particular, we compare specialized implementations of several high-performance eigensolvers that, by relying on disk storage and out-of-core (OOC) techniques, can in principle tackle the large memory requirements of these biological problems, which in general do not fit into the main memory of current desktop machines. All these OOC eigensolvers, except for one, are composed of compute-bound (i.e., arithmetically-intensive) operations, which we accelerate by exploiting the performance of current multicore processors and, in some cases, by additionally off-loading certain parts of the computation to a GPU accelerator. One of the eigensolvers is a memory-bound algorithm, which strongly constrains its performance when the data is on disk. However, this method exhibits a much lower arithmetic cost compared with its compute-bound alternatives for this particular application. Experimental results on a desktop platform, representative of current server technology, illustrate the potential of these methods to address the simulation of biological activity.

    Verified partial eigenvalue computations using contour integrals for Hermitian generalized eigenproblems

    We propose a verified computation method for partial eigenvalues of a Hermitian generalized eigenproblem. The block Sakurai-Sugiura Hankel method, a contour integral-type eigensolver, can reduce a given eigenproblem into a generalized eigenproblem of block Hankel matrices whose entries consist of complex moments. In this study, we evaluate all errors in computing the complex moments. We derive a truncation error bound of the quadrature. Then, we take numerical errors of the quadrature into account and rigorously enclose the entries of the block Hankel matrices. Each quadrature point gives rise to a linear system, and its structure enables us to develop an efficient technique to verify the approximate solution. Numerical experiments show that the proposed method outperforms a standard method and suggest that the proposed method is potentially efficient in parallel. Comment: 15 pages, 4 figures, 1 table
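
    The reduction to a Hankel eigenproblem described in this abstract can be illustrated with a minimal sketch. It is not the authors' verified method: it uses plain floating-point arithmetic with no error enclosure, block size 1 (scalar moments), and a hard-coded 2x2 Hankel pencil, so it assumes exactly two eigenvalues lie inside the contour. The function names `solve` and `ss_hankel_two`, the circular contour, the midpoint-shifted trapezoidal rule, and the test matrix are all assumptions chosen for illustration.

```python
import cmath
import math


def solve(M, b):
    """Solve the small dense system M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(M[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    x = [0j] * n
    for i in reversed(range(n)):
        s = aug[i][n] - sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = s / aug[i][i]
    return x


def ss_hankel_two(A, B, v, center, radius, n_quad=64):
    """Approximate the two eigenvalues of A x = lambda B x inside a circle.

    Scalar moments mu_k = (1/(2 pi i)) * contour integral of
    (z - center)^k * v^T (zB - A)^{-1} B v dz are approximated by the
    trapezoidal rule; each quadrature point costs one linear solve.
    """
    n = len(A)
    Bv = [sum(B[r][c] * v[c] for c in range(n)) for r in range(n)]
    mu = [0j] * 4
    for j in range(n_quad):
        theta = 2.0 * math.pi * (j + 0.5) / n_quad
        w = radius * cmath.exp(1j * theta)  # w = z - center
        z = center + w
        M = [[z * B[r][c] - A[r][c] for c in range(n)] for r in range(n)]
        x = solve(M, Bv)
        f = sum(v[i] * x[i] for i in range(n))
        for k in range(4):
            mu[k] += (w ** (k + 1)) * f / n_quad
    # 2x2 Hankel pencil: det([[mu1, mu2], [mu2, mu3]] - t [[mu0, mu1], [mu1, mu2]]) = 0
    a2 = mu[0] * mu[2] - mu[1] ** 2
    a1 = mu[1] * mu[2] - mu[0] * mu[3]
    a0 = mu[1] * mu[3] - mu[2] ** 2
    disc = cmath.sqrt(a1 * a1 - 4.0 * a2 * a0)
    roots = [(-a1 + disc) / (2.0 * a2), (-a1 - disc) / (2.0 * a2)]
    # Roots are eigenvalues shifted by the contour center; they are real for a Hermitian pencil.
    return sorted((center + t).real for t in roots)


# Illustrative problem: 3x3 tridiagonal A, B = identity; the circle |z - 1| = 1.5
# encloses the eigenvalues 2 - sqrt(2) and 2 but not 2 + sqrt(2).
A = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
B = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
print(ss_hankel_two(A, B, [1.0, 0.0, 0.0], center=1.0, radius=1.5))
```

    In this sketch the quadrature error decays geometrically with `n_quad`, which is the quantity a truncation error bound of the kind mentioned in the abstract would control; a verified variant would additionally enclose rounding errors in each linear solve.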

    On implicit-factorization constraint preconditioners

    Recently Dollar and Wathen [14] proposed a class of incomplete factorizations for saddle-point problems, based upon earlier work by Schilders [40]. In this paper, we generalize this class of preconditioners, and examine the spectral implications of our approach. Numerical tests indicate the efficacy of our preconditioners.

    Parallel finite element density functional computations exploiting grid refinement and subspace recycling

    In this communication computational methods that facilitate finite element analysis of density functional computations are developed. They are: (i) h-adaptive grid refinement techniques that reduce the total number of degrees of freedom in the real space grid while improving on the approximate resolution of the wanted solution; and (ii) subspace recycling of the approximate solution in self-consistent cycles with the aim of improving the performance of the generalized eigenproblem solver. These techniques are shown to give a convincing speed-up in the computation process by alleviating the overhead normally associated with computing systems with many degrees of freedom.
    The anonymous referees whose comments improved the presentation of this work are gratefully acknowledged. The work was supported by the Polish Ministry of Science and Higher Education N N519402837 and by the Spanish Ministry of Science and Innovation TIN2009-07519 and TIN2012-32846. The resources provided by the Barcelona Supercomputing Center are also acknowledged.
    Young, TD.; Romero Alcalde, E.; Román Moltó, JE. (2013). Parallel finite element density functional computations exploiting grid refinement and subspace recycling. Computer Physics Communications. 184(1):66-72. doi:10.1016/j.cpc.2012.08.011

    Adaptive BDDC in Three Dimensions

    The adaptive BDDC method is extended to the selection of face constraints in three dimensions. A new implementation of the BDDC method is presented based on a global formulation without an explicit coarse problem, with massive parallelism provided by a multifrontal solver. Constraints are implemented by a projection and sparsity of the projected operator is preserved by a generalized change of variables. The effectiveness of the method is illustrated on several engineering problems. Comment: 28 pages, 9 figures, 9 tables

    KSPHPDDM and PCHPDDM: Extending PETSc with advanced Krylov methods and robust multilevel overlapping Schwarz preconditioners

    Contemporary applications in computational science and engineering often require the solution of linear systems which may be of different sizes, shapes, and structures. The goal of this paper is to explain how two libraries, PETSc and HPDDM, have been interfaced in order to offer end-users robust overlapping Schwarz preconditioners and advanced Krylov methods featuring recycling and the ability to deal with multiple right-hand sides. The flexibility of the implementation is showcased and explained with minimalist, easy-to-run, and reproducible examples, to ease the integration of these algorithms into more advanced frameworks. The examples provided cover applications from eigenanalysis, elasticity, combustion, and electromagnetism.
    Jose E. Roman was supported by the Spanish Agencia Estatal de Investigacion (AEI) under project SLEPc-DA (PID2019-107379RB-I00).
    Jolivet, P.; Roman, JE.; Zampini, S. (2021). KSPHPDDM and PCHPDDM: Extending PETSc with advanced Krylov methods and robust multilevel overlapping Schwarz preconditioners. Computers & Mathematics with Applications. 84:277-295. https://doi.org/10.1016/j.camwa.2021.01.003

    Multi-patch discontinuous Galerkin isogeometric analysis for wave propagation: explicit time-stepping and efficient mass matrix inversion

    We present a class of spline finite element methods for time-domain wave propagation which are particularly amenable to explicit time-stepping. The proposed methods utilize a discontinuous Galerkin discretization to enforce continuity of the solution field across geometric patches in a multi-patch setting, which yields a mass matrix with convenient block diagonal structure. Over each patch, we show how to accurately and efficiently invert mass matrices in the presence of curved geometries by using a weight-adjusted approximation of the mass matrix inverse. This approximation restores a tensor product structure while retaining provable high order accuracy and semi-discrete energy stability. We also estimate the maximum stable timestep for spline-based finite elements and show that the use of spline spaces results in less stringent CFL restrictions than equivalent piecewise continuous or discontinuous finite element spaces. Finally, we explore the use of optimal knot vectors based on L2 n-widths. We show how the use of optimal knot vectors can improve both approximation properties and the maximum stable timestep, and present a simple heuristic method for approximating optimal knot positions. Numerical experiments confirm the accuracy and stability of the proposed methods.

    A Fast Hierarchically Preconditioned Eigensolver Based on Multiresolution Matrix Decomposition

    In this paper we propose a new iterative method to hierarchically compute a relatively large number of leftmost eigenpairs of a sparse symmetric positive matrix under the multiresolution operator compression framework. We exploit the well-conditioned property of every decomposition component by integrating the multiresolution framework into the implicitly restarted Lanczos method. We achieve this combination by proposing an extension-refinement iterative scheme, in which the intrinsic idea is to decompose the target spectrum into several segments such that the corresponding eigenproblem in each segment is well-conditioned. Theoretical analysis and numerical experiments are also reported to demonstrate the efficiency and effectiveness of this algorithm.