Artificial Neural Network Performance Model for Parallel Particle Transport Calculation

Abstract

There is a need to improve the predictive capability of high-fidelity simulations of physical phenomena that include the transport of thermal radiation and/or other subatomic particles. Improved capability has many ingredients, including solution algorithms that use modern massively parallel computers more efficiently. The most time-consuming element of many widely used particle-transport methods is the transport sweep, in which the particle intensity, a function of position, energy, and direction, is calculated given the most recent estimate of the collisional source. The intensity in a given spatial cell depends on the intensity entering from neighboring cells in the given direction, which restricts the order of calculations and requires cells to communicate exiting intensities to their downstream neighbors. Such dependencies and communication requirements make parallel execution more difficult.

A parallel transport calculation in Texas A&M's state-of-the-art PDT code partitions the spatial domain across processors as directed by partitioning parameters. It aggregates spatial cells into cellsets, directions into anglesets, and energy groups into groupsets, as directed by aggregation parameters. A single work unit during a sweep calculates particle intensities for a single cellset/angleset/groupset combination. At each "stage" of the sweep, every processor with available work executes one work unit and communicates outflow intensities to the processors responsible for adjacent downstream cellsets. The ingredients of the "optimal sweep" methodology developed by Texas A&M in collaboration with the NNSA labs are: (i) a provably optimal scheduling algorithm, which executes the sweep in the minimum possible number of stages for any given partitioning and aggregation factors; (ii) a performance model that predicts sweep time for that execution; and (iii) an algorithm that chooses partitioning and aggregation factors that minimize sweep time.

Here we explore the use of Artificial Neural Networks (ANNs) for such a performance model, and for its memory-use counterpart, and compare against our previous models. We design simple networks that can replicate the previous models but can also augment them with nonlinear corrections when this better fits the data. These simple nonlinear ANNs outperform our previous models, reducing average prediction errors from ≈ 41% to ≈ 21% for some problems of interest, although large maximum errors are observed for both models. Additionally, PDT reports unexpected results for parallel problems, possibly contributing to the large maximum observed errors. Despite this, both the ANN-based nonlinear model and our previous model show promise for practical use in an algorithm such as the one described in (iii). The memory-usage model shows promising results, predicting memory usage to within ≈ 0.024 GB for out-of-sample data points.
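To make the stage-by-stage execution pattern concrete, the following toy sketch simulates a sweep in which each processor executes at most one ready work unit per stage and completed units release their downstream dependents. It is an illustration only: the function name, data layout, and first-ready tie-break are hypothetical and do not reflect PDT's actual optimal scheduler.

```python
def count_sweep_stages(tasks, deps, owner):
    """Count stages for a toy sweep: at each stage, every processor
    with at least one ready work unit executes exactly one of them.

    tasks: iterable of work-unit ids (cellset/angleset/groupset combos)
    deps:  dict mapping a work unit to the upstream units it needs
    owner: dict mapping each work unit to its processor
    """
    done = set()
    remaining = set(tasks)
    stages = 0
    while remaining:
        ready = [t for t in remaining
                 if all(d in done for d in deps.get(t, ()))]
        if not ready:  # only possible if the dependency graph is cyclic
            raise ValueError("dependency graph is not sweepable")
        # Each processor picks one ready unit (arbitrary tie-break here;
        # PDT's scheduler uses provably optimal priorities instead).
        picked = {}
        for t in sorted(ready):
            picked.setdefault(owner[t], t)
        done.update(picked.values())
        remaining -= set(picked.values())
        stages += 1
    return stages

# Toy example: a 1-D chain of four cellsets, two per processor,
# swept left to right in a single direction.
tasks = [0, 1, 2, 3]
deps = {1: [0], 2: [1], 3: [2]}
owner = {0: "P0", 1: "P0", 2: "P1", 3: "P1"}
print(count_sweep_stages(tasks, deps, owner))  # -> 4 stages
```

In this chain every unit waits on its upstream neighbor, so the toy sweep serializes into four stages; richer dependency graphs and better priorities are what the optimal scheduling algorithm in (i) exploits.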
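The "replicate previous models plus nonlinear corrections" design could be realized, for example, as a network with a direct linear path alongside a small nonlinear branch: if the branch's weights train to zero, the network reduces to a linear fit of the input features. The sketch below is a minimal illustration assuming a PyTorch implementation with hypothetical feature and layer sizes, not the exact network studied here.

```python
import torch
import torch.nn as nn

class SweepTimeModel(nn.Module):
    """Linear baseline plus a small nonlinear correction branch.

    With the correction branch's weights at zero, the model reduces to
    a plain linear fit of the input features (e.g., partitioning and
    aggregation factors); the tanh branch lets training add nonlinear
    corrections only when the data warrant it.
    """
    def __init__(self, n_features: int, n_hidden: int = 8):
        super().__init__()
        self.linear = nn.Linear(n_features, 1)   # baseline term
        self.correction = nn.Sequential(         # nonlinear correction
            nn.Linear(n_features, n_hidden),
            nn.Tanh(),
            nn.Linear(n_hidden, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.linear(x) + self.correction(x)

# Hypothetical usage: fit predicted sweep time to measured sweep time.
model = SweepTimeModel(n_features=6)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randn(32, 6)          # placeholder feature vectors
t_measured = torch.rand(32, 1)  # placeholder measured sweep times
for _ in range(100):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(x), t_measured)
    loss.backward()
    opt.step()
```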
