Partitioning Techniques and Their Parallelization for Stiff System of Ordinary Differential Equations
A new code based on variable-order, variable-stepsize componentwise
partitioning is introduced to solve a system of equations dynamically. In
previous research on partitioning techniques, once an equation was identified
as stiff, it remained in the stiff subsystem until the integration was
completed. In the current technique, the whole system is initially treated as
nonstiff, and any equation that causes stiffness is moved to the stiff
subsystem. Should its characteristics later indicate nonstiffness, it is again
treated with the Adams method. This switching between stiff and nonstiff
treatment continues whenever necessary until the interval of integration is
completed.
Next, a block method with R points, which generates R new approximate solution
values per step, is a strategy both for solving a system of ODEs and for
parallelizing the computation. Partitioning this block method to solve stiff
differential equations is a new strategy; it is more efficient and takes less
computational time than the corresponding sequential methods. Two partitioning
techniques are constructed: Intervalwise Block Partitioning (IBP) and
Componentwise Block Partitioning (CBP). Numerical results are compared to
validate their effectiveness.
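As a rough illustration of the block idea (not the thesis code), the following sketch implements a 2-point block method in Python: each block step produces two new solution values at once, using the third-order Adams-Moulton formula for the first point and Simpson's rule for the second, with the implicit formulas resolved by a fixed-point corrector iteration. The function names, the corrector iteration count, and the exact-value bootstrap are illustrative assumptions.

    import math

    def block2_step(f, t_n, h, y_n, f_nm1, f_n, iters=3):
        """One 2-point block step from t_n: returns y_{n+1} and y_{n+2}.

        First point:  Adams-Moulton (order 3):
            y_{n+1} = y_n + h/12 * (5*f_{n+1} + 8*f_n - f_{n-1})
        Second point: Simpson's rule:
            y_{n+2} = y_n + h/3 * (f_{n+2} + 4*f_{n+1} + f_n)
        Both implicit formulas are resolved by fixed-point iteration,
        which is adequate for nonstiff problems.
        """
        y1 = y2 = y_n                      # initial guesses
        for _ in range(iters):
            f1 = f(t_n + h, y1)
            f2 = f(t_n + 2 * h, y2)
            y1 = y_n + h / 12 * (5 * f1 + 8 * f_n - f_nm1)
            y2 = y_n + h / 3 * (f2 + 4 * f1 + f_n)
        return y1, y2

    # Example: y' = -y, exact solution exp(-t); bootstrap one back value.
    f = lambda t, y: -y
    h = 0.01
    y1v = math.exp(-h)
    ya, yb = block2_step(f, h, h, y1v, f(0.0, 1.0), f(h, y1v))
    print(ya - math.exp(-2 * h), yb - math.exp(-3 * h))  # both tiny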
Intervalwise block partitioning initially treats the system of equations as
nonstiff and solves it using the Adams method, switching to the Backward
Differentiation Formula when there is a step failure and an indication of
stiffness. Componentwise block partitioning places the equations that cause
instability and stiffness into the stiff subsystem, where they are solved
using the Backward Differentiation Formula, while all other equations are
still treated as nonstiff and solved using the Adams formula.
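A minimal sketch of the componentwise switching logic, under loudly simplified assumptions: backward Euler (BDF1) stands in for the variable-order Backward Differentiation Formula, the two-step Adams-Bashforth formula stands in for the Adams method, and a crude diagonal-Jacobian test re-partitions the components. None of the names or thresholds below come from the original code.

    import numpy as np

    def partitioned_step(f, jac_diag, t, h, y, y_old, stiff, tau=2.0):
        """Advance one step, treating components by their current stiffness.

        stiff : boolean mask of components in the stiff subsystem.
        Nonstiff components: 2-step Adams-Bashforth (explicit),
            y_{n+1} = y_n + h/2 * (3*f_n - f_{n-1}).
        Stiff components: backward Euler, i.e. BDF1 (implicit),
            resolved by a diagonal Newton iteration.
        """
        f_n, f_old = f(t, y), f(t - h, y_old)
        y_new = y.copy()

        ns = ~stiff                        # explicit update, nonstiff subsystem
        y_new[ns] = y[ns] + h / 2 * (3 * f_n[ns] - f_old[ns])

        for _ in range(5):                 # implicit update, stiff subsystem
            r = y_new - y - h * f(t + h, y_new)       # BDF1 residual
            d = 1.0 - h * jac_diag(t + h, y_new)      # its diagonal Jacobian
            y_new[stiff] -= (r / d)[stiff]

        # Re-partition: flag component i as stiff when h*|df_i/dy_i| > tau.
        stiff = h * np.abs(jac_diag(t + h, y_new)) > tau
        return y_new, stiff

    # Example: component 0 is stiff, component 1 is not.
    f = lambda t, y: np.array([-1000.0 * (y[0] - np.cos(t)), -y[1]])
    jd = lambda t, y: np.array([-1000.0, -1.0])
    y0 = np.array([1.0, 1.0])
    y1, mask = partitioned_step(f, jd, 0.0, 0.01, y0, y0.copy(),
                                np.array([True, False]))
    print(y1, mask)                        # component 0 stays in the stiff set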
Parallelizing the partitioning strategies using the Message Passing Interface
(MPI) is well suited to solving large systems of equations. Parallelizing the
right algorithm in the partitioning code gives better performance with shorter
execution times. Graphs of performance and execution time visualize the
advantages of parallelization.
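A minimal mpi4py sketch of the distribution strategy, assuming for brevity that the system splits into independent blocks so that no communication is needed during a step; a genuinely coupled system would also exchange boundary components every step. All names here are illustrative, not the original code.

    import numpy as np
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()

    N = 1024                      # total equations (assumed divisible by size)
    n = N // size                 # equations owned by this rank

    # Root scatters the initial condition; each rank receives its block.
    y_full = np.linspace(1.0, 2.0, N) if rank == 0 else None
    y = np.empty(n)
    comm.Scatter(y_full, y, root=0)

    def f(t, y):                  # block-local right-hand side: y' = -y
        return -y

    t, h = 0.0, 1e-3              # each rank integrates independently
    for _ in range(1000):         # (forward Euler just for brevity)
        y += h * f(t, y)
        t += h

    out = np.empty(N) if rank == 0 else None
    comm.Gather(y, out, root=0)   # collect the distributed solution
    if rank == 0:
        print(out[:4])            # run with: mpiexec -n 4 python script.py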
The effect of FPU architecture on a dynamic precision algorithm for the solution of differential equations
Solution of Initial Value Problems (IVPs) is an important application in scientific computing. Methods for solving these problems use techniques for reducing the error and increasing the speed of the computation. This paper introduces a class of algorithms which dynamically reconfigure their operating parameters to reduce the computation time. By dynamically varying the precision of the arithmetic being performed, it is possible to obtain dramatic speedups on certain architectures when solving IVPs. This paper illustrates how various architectures affect a dynamic precision version of the Runge-Kutta-Fehlberg algorithm. It is shown that a speedup of over 30 percent is possible for both massively parallel processors and vector supercomputers.
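As a hedged illustration of the dynamic-precision idea (not the paper's algorithm), the sketch below uses an embedded Heun/Euler pair in place of the full Runge-Kutta-Fehlberg tableau and computes each step in float32 whenever the requested tolerance sits comfortably above the single-precision roundoff floor, escalating to float64 otherwise; the safety factor of 100 is an assumption.

    import numpy as np

    def adaptive_precision_step(f, t, y, h, tol):
        """One embedded Heun/Euler step with dynamically chosen precision.

        If tol is well above float32 roundoff (relative to |y|), the cheap
        single-precision path is taken; otherwise float64 is used.
        """
        floor32 = 100 * np.finfo(np.float32).eps * max(1.0, abs(float(y)))
        dtype = np.float32 if tol > floor32 else np.float64
        yk, hk = dtype(y), dtype(h)

        k1 = f(t, yk)
        y_euler = yk + hk * k1                 # 1st-order solution
        k2 = f(t + hk, y_euler)
        y_heun = yk + hk / 2 * (k1 + k2)       # 2nd-order solution
        err = abs(float(y_heun - y_euler))     # embedded error estimate
        return float(y_heun), err, dtype.__name__

    # Example: y' = -y; the loose tolerance selects the float32 path.
    f = lambda t, y: -y
    print(adaptive_precision_step(f, 0.0, 1.0, 0.01, tol=1e-4))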
Systematic construction of efficient six-stage fifth-order explicit Runge-Kutta embedded pairs without standard simplifying assumptions
This thesis examines methodologies and software to construct explicit
Runge-Kutta (ERK) pairs for solving initial value problems (IVPs) by
constructing efficient six-stage fifth-order ERK pairs without
standard simplifying assumptions. The problem of whether efficient
higher-order ERK pairs can be constructed algebraically without the
standard simplifying assumptions dates back to at least the 1960s,
with Cassity's complete solution of the six-stage fifth-order order
conditions. Although RK methods based on the six-stage fifth-order
order conditions have been widely studied and have continuing
practical importance, prior to this thesis the aforementioned
complete solution to these order conditions had no published usage
beyond the original series of publications by Cassity in the 1960s.
The complete solution of six-stage fifth-order ERK order conditions
published by Cassity in 1969 is not in a formulation that can easily
be used for practical purposes, such as a software implementation.
However, it is shown in this thesis that when the order conditions are
solved and formulated appropriately using a computer algebra system
(CAS), the generated code can be used for practical purposes and the
complete solution is readily extended to ERK pairs. The condensed
matrix form of the order conditions introduced by Cassity in 1969 is
shown to be an ideal methodology for solving order conditions using a
CAS, one that probably has wider applicability. The software package
OCSage, developed for this thesis to solve the order conditions and
study the properties of the resulting methods, is built on top of the
Sage CAS.
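OCSage itself is built on Sage; as a much smaller analogue of what "checking order conditions with a CAS" means, the sketch below verifies, in exact rational arithmetic with SymPy, the eight order conditions through order 4 for the classical RK4 tableau. It illustrates the idea only and is not OCSage.

    from sympy import Rational as R, Matrix, ones

    # Classical RK4 Butcher tableau, in exact rationals.
    A = Matrix([[0, 0, 0, 0],
                [R(1, 2), 0, 0, 0],
                [0, R(1, 2), 0, 0],
                [0, 0, 1, 0]])
    b = Matrix([R(1, 6), R(1, 3), R(1, 3), R(1, 6)])
    c = Matrix([0, R(1, 2), R(1, 2), 1])

    def dot(u, v):
        return sum(u[i] * v[i] for i in range(len(u)))

    cc = Matrix([ci**2 for ci in c])           # elementwise c^2
    Ac = A * c
    conditions = [                             # (expression, required value)
        (dot(b, ones(4, 1)), R(1)),            # sum b_i          = 1
        (dot(b, c), R(1, 2)),                  # sum b_i c_i      = 1/2
        (dot(b, cc), R(1, 3)),                 # sum b_i c_i^2    = 1/3
        (dot(b, Ac), R(1, 6)),                 # sum b_i (Ac)_i   = 1/6
        (dot(b, Matrix([ci**3 for ci in c])), R(1, 4)),
        (dot(b, Matrix([c[i] * Ac[i] for i in range(4)])), R(1, 8)),
        (dot(b, A * cc), R(1, 12)),
        (dot(b, A * Ac), R(1, 24)),
    ]
    assert all(lhs == rhs for lhs, rhs in conditions)
    print("all eight order conditions through order 4 hold")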
However, in order to effectively determine that the constructed ERK
pairs without standard simplifying assumptions are in fact efficient
by some well-defined criteria, the process of selecting the
coefficients of ERK pairs is re-examined in conjunction with a
sufficient amount of performance data. The pythODE software package
developed for this thesis is used to generate a large amount of
performance data from a large selection of candidate ERK pairs found
using OCSage. In particular, it is shown that there is unlikely to be
a well-defined methodology for selecting optimal pairs for
general-purpose use, other than avoiding poor choices of certain
properties and ensuring the error coefficients are as small as
possible. However, for IVPs from celestial mechanics, there are
obvious optimal pairs that have specific values of a small subset of
the principal error coefficients (PECs). Statements seen in the
literature that the best that can be done is treating all PECs equally
do not necessarily apply to at least some broad classes of IVPs. By
choosing ERK pairs based on specific values of individual PECs, not
only are ERK pairs that are 20-30% more efficient than comparable
published pairs found for test sets of IVPs from celestial mechanics,
but the variation in performance between the best and worst ERK pairs
that otherwise would seem to have similar properties is reduced from a
factor of 2 down to as low as 15%. Based on observations of the small
number of IVPs of other classes in common IVP test sets, there are
other classes of IVPs that have different optimal values of the PECs.
A more general contribution of this thesis is that it specifically
demonstrates how specialized software tools and a larger amount of
performance data than is typical can support novel empirical insights
into numerical methods.
Using SIMD and SIMT vectorization to evaluate sparse chemical kinetic Jacobian matrices and thermochemical source terms
Accurately predicting key combustion phenomena in reactive-flow simulations, e.g., lean blow-out, extinction/ignition limits and pollutant formation, necessitates the use of detailed chemical kinetics. The large size and high levels of numerical stiffness typically present in chemical kinetic models relevant to transportation/power-generation applications make the efficient evaluation/factorization of the chemical kinetic Jacobian and thermochemical source terms critical to the performance of reactive-flow codes. Here we investigate the performance of vectorized evaluation of constant-pressure/volume thermochemical source terms and sparse/dense chemical kinetic Jacobians using single-instruction, multiple-data (SIMD) and single-instruction, multiple-thread (SIMT) paradigms. These are implemented in pyJac, an open-source, reproducible code-generation platform. Selected chemical kinetic models covering the range of sizes typically used in reactive-flow simulations were used for demonstration. A new formulation of the chemical kinetic governing equations was derived and verified, resulting in Jacobian sparsities of 28.6-92.0% for the tested models. Speedups of 3.40-4.08x were found for shallow-vectorized OpenCL source-rate evaluation compared with a parallel OpenMP code on an AVX2 central processing unit (CPU), increasing to 6.63-9.44x and 3.03-4.23x for sparse and dense chemical kinetic Jacobian evaluation, respectively. Furthermore, the effect of data ordering was investigated and a storage pattern specifically formulated for vectorized evaluation was proposed, and the effects of the constant-pressure/volume assumptions and of varying vector widths on source-term evaluation performance were studied. Speedups reached up to 17.60x and 45.13x for dense and sparse evaluation on the GPU, and up to 55.11x and 245.63x on the CPU, over a first-order finite-difference Jacobian approach. Further, dense Jacobian evaluation was up to 19.56x and 2.84x faster than a previous version of pyJac on a CPU and GPU, respectively. Finally, future directions for vectorized chemical kinetic evaluation and sparse linear-algebra techniques are discussed.
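pyJac generates OpenCL/CUDA/C code; as a rough NumPy analogue of the data-layout point, the sketch below evaluates modified-Arrhenius rate coefficients for a batch of thermochemical states stored structure-of-arrays (all temperatures contiguous), the layout that lets SIMD lanes each process one state. The rate parameters are invented for illustration.

    import numpy as np

    R_U = 8.314462618            # universal gas constant, J/(mol K)

    def arrhenius_batch(T, A, beta, Ea):
        """Vectorized k = A * T**beta * exp(-Ea / (R_U * T)).

        T           : (n_states,)  one temperature per thermochemical state
        A, beta, Ea : (n_rxns,)    per-reaction rate parameters
        returns     : (n_rxns, n_states); the state axis is contiguous,
                      so each row is evaluated across states in SIMD fashion.
        """
        T = T[np.newaxis, :]     # broadcast states across reactions
        return A[:, None] * T**beta[:, None] * np.exp(-Ea[:, None] / (R_U * T))

    # Structure-of-arrays batch: 8 states, 3 invented reactions.
    T = np.linspace(900.0, 1600.0, 8)
    A = np.array([1.0e10, 3.2e7, 5.0e12])
    beta = np.array([0.0, 1.5, -0.5])
    Ea = np.array([1.0e5, 8.0e4, 1.5e5])      # J/mol
    print(arrhenius_batch(T, A, beta, Ea).shape)   # (3, 8)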