4 research outputs found

    Semi-analytic integration for a parallel space-time boundary element method modelling the heat equation

    This paper concentrates on the boundary element method (BEM) for the heat equation in three spatial dimensions. In particular, we deal with tensor product space-time meshes, which allow for quadrature schemes that are analytic in time and numerical in space. The spatial integrals can be treated by standard BEM techniques known from three-dimensional stationary problems. The contribution of the paper is twofold. First, we provide the temporal antiderivatives of the heat kernel necessary for the assembly of BEM matrices and the evaluation of the representation formula. Second, the presented approach has been implemented in the publicly available library besthea, allowing researchers to reuse the formulae and BEM routines straightaway. The results are validated by numerical experiments in an HPC environment.
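As a rough illustration of what "analytic in time" can mean here, the sketch below compares the simplest temporal antiderivative of the 3D heat kernel with a plain numerical quadrature. It is not taken from the paper or from besthea: the function names, the heat capacity constant `alpha`, and the midpoint check are assumptions made for this example, and the formulae actually needed for BEM matrix assembly (which integrate over both a test and a trial time interval) are more involved.

```python
import math


def heat_kernel(r, t, alpha=1.0):
    """3D fundamental solution of u_t = alpha * Laplace(u) at distance r, time t."""
    if t <= 0.0:
        return 0.0
    return (4.0 * math.pi * alpha * t) ** -1.5 * math.exp(-r * r / (4.0 * alpha * t))


def heat_kernel_time_antiderivative(r, t, alpha=1.0):
    """Closed form of the integral of heat_kernel(r, tau) for tau in (0, t), r > 0.

    Substituting u = r / (2 * sqrt(alpha * tau)) reduces the integral to a
    complementary error function.
    """
    if t <= 0.0:
        return 0.0
    return math.erfc(r / (2.0 * math.sqrt(alpha * t))) / (4.0 * math.pi * alpha * r)


def midpoint_quadrature(r, t, alpha=1.0, n=20000):
    """Reference value: composite midpoint rule in time (hypothetical helper)."""
    h = t / n
    return h * sum(heat_kernel(r, (k + 0.5) * h, alpha) for k in range(n))


if __name__ == "__main__":
    r, t = 1.0, 0.5
    print("closed form:", heat_kernel_time_antiderivative(r, t))
    print("quadrature: ", midpoint_quadrature(r, t))
```

For large `t` the closed form tends to `1 / (4 * pi * alpha * r)`, the Laplace kernel scaled by `1 / alpha`, which is consistent with the abstract's remark that the remaining spatial integrals resemble those of stationary problems.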

    Boundary element quadrature schemes for multi- and many-core architectures

    In this paper we study the performance of the regularized boundary element quadrature routines implemented in the BEM4I library developed by the authors. Apart from the results obtained on the classical multi-core architecture represented by the Intel Xeon processors, we concentrate on the portability of the code to the many-core Intel Xeon Phi family. Unlike GP-GPU programming, which accelerates many scientific codes, the standard x86 architecture of the Xeon Phi processors allows the existing multi-core implementation to be reused. Although in many cases a simple recompilation would lead to inefficient utilization of the Xeon Phi, the effort invested in optimization usually leads to better performance on the multi-core Xeon processors as well. This makes the Xeon Phi an interesting platform for scientists developing a software library aimed at both modern portable PCs and high performance computing environments. Here we focus on the manually vectorized assembly of the local element contributions and the parallel assembly of the global matrices on shared memory systems. Due to the quadratic complexity of the standard assembly, we also present an assembly sparsified by the adaptive cross approximation and based on the same acceleration techniques. The numerical experiments performed on the Xeon multi-core processor and two generations of the Xeon Phi many-core platform validate the proposed implementation and highlight the importance of vectorization for exploiting the features of modern hardware.