3 research outputs found

A locality-based threading algorithm for the configuration-interaction method

Author: Shan H
Publication date: 28/04/2017

A locality-based threading algorithm for the configuration-interaction method

Author: Johnson C, McElvain K, Shan H, Williams S
Publication venue: eScholarship, University of California
Publication date: 30/06/2017

The Configuration Interaction (CI) method has been widely used to solve the non-relativistic many-body Schrödinger equation. One great challenge to implementing it efficiently on manycore architectures is its immense memory and data movement requirements. To address this issue, within each node, we exploit a hybrid MPI+OpenMP programming model in lieu of the traditional flat MPI programming model. In this paper, we develop optimizations that partition the workloads among OpenMP threads based on data locality, which is essential in ensuring that applications with complex data access patterns scale well on manycore architectures. The new algorithm scales to 256 threads on the 64-core Intel Knights Landing (KNL) manycore processor and 24 threads on dual-socket Ivy Bridge (Xeon) nodes. Compared with the original implementation, the performance has been improved by up to 7× on the Knights Landing processor and 3× on the dual-socket Ivy Bridge node.
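
The general idea of locality-based work partitioning can be illustrated with a minimal sketch. This is not the paper's algorithm or code: it assumes, purely for illustration, that the workload is a sparse matrix-vector product, and it uses Python threads as a stand-in for OpenMP threads. The names partition_rows, spmv_block, and threaded_spmv are hypothetical. Each thread receives one contiguous block of rows with roughly equal work, so it writes only to its own slice of the output vector.

```python
from concurrent.futures import ThreadPoolExecutor


def partition_rows(row_costs, n_threads):
    """Split rows into at most n_threads contiguous blocks of similar total cost."""
    total = sum(row_costs)
    bounds = [0]
    acc = 0
    for i, cost in enumerate(row_costs):
        acc += cost
        # Close the current block once the running cost reaches the next even share.
        if len(bounds) < n_threads and acc * n_threads >= total * len(bounds):
            bounds.append(i + 1)
    bounds.append(len(row_costs))
    return [(bounds[k], bounds[k + 1]) for k in range(len(bounds) - 1)]


def spmv_block(rows, x, y, start, stop):
    """Compute y[start:stop]; this worker is the only writer to that slice."""
    for i in range(start, stop):
        y[i] = sum(value * x[col] for col, value in rows[i])


def threaded_spmv(rows, x, n_threads):
    """Multiply a sparse matrix (list of rows of (col, value) pairs) by x."""
    y = [0.0] * len(rows)
    # Use the nonzero count of each row as a simple (assumed) cost model.
    blocks = partition_rows([len(r) for r in rows], n_threads)
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        futures = [pool.submit(spmv_block, rows, x, y, a, b) for a, b in blocks]
        for f in futures:
            f.result()  # re-raise any worker exception
    return y


if __name__ == "__main__":
    rows = [[(0, 2.0)], [(0, 1.0), (1, 3.0)], [(2, 4.0)], []]
    print(threaded_spmv(rows, [1.0, 2.0, 3.0], n_threads=2))
```

Because the blocks are contiguous and disjoint, no two threads write to the same output entries, which is the property a locality-based partition aims for. The sketch shows the partitioning logic only; Python threads will not reproduce the speedups reported in the abstract.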

Crossref

eScholarship - University of California