27 research outputs found

    Asynchronous and corrected-asynchronous numerical solutions of parabolic PDEs on MIMD multiprocessors

    A major problem in achieving significant speed-up on parallel machines is the overhead involved in synchronizing the concurrent processes. Removing the synchronization constraint has the potential to speed up the computation. The authors present asynchronous (AS) and corrected-asynchronous (CA) finite difference schemes for the multi-dimensional heat equation. Although the discussion concentrates on the Euler scheme for the solution of the heat equation, it has the potential to be extended to other schemes and other parabolic partial differential equations (PDEs). These schemes are analyzed and implemented on the shared-memory, multi-user Sequent Balance machine. Numerical results for one- and two-dimensional problems are presented. It is shown experimentally that the synchronization penalty can be about 50 percent of run time: in most cases, the asynchronous scheme runs twice as fast as the parallel synchronous scheme. In general, the efficiency of the parallel schemes increases with processor load, with the time level, and with the problem dimension. The efficiency of the AS scheme may reach 90 percent or more, but it provides accurate results only for steady-state values. The CA scheme, on the other hand, is less efficient but provides more accurate results for intermediate (non-steady-state) values.
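
As a rough illustration of the asynchronous idea described in this abstract, the sketch below emulates, in serial Python, an explicit Euler update for the 1D heat equation in which values read across sub-domain boundaries may be a few time levels old. It is not the authors' AS or CA scheme. The function name `solve_heat_async`, the random delay model, the grid size, and the boundary values u(0)=0, u(1)=1 are all assumptions made for this example.

```python
import numpy as np

def solve_heat_async(n=21, n_pe=4, r=0.25, steps=20000, max_delay=3, seed=0):
    """Serial emulation of an asynchronous explicit Euler heat solver.

    Hypothetical setup: n grid points on [0, 1] split among n_pe emulated
    processing elements; r = alpha*dt/dx**2; values read across a
    sub-domain boundary are up to max_delay time levels old.
    """
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, n)
    u = np.zeros(n)
    u[-1] = 1.0                          # Dirichlet data: u(0)=0, u(1)=1
    owner = np.arange(n) * n_pe // n     # which emulated PE owns each point
    # (i, j): interior point i reads neighbour j owned by a different PE
    halo = [(i, j) for i in range(1, n - 1)
            for j in (i - 1, i + 1) if owner[i] != owner[j]]
    history = [u]                        # history[-1 - k] is the state k steps ago
    for _ in range(steps):
        cur = history[-1]
        left = cur[:-2].copy()
        right = cur[2:].copy()
        delays = rng.integers(0, max_delay + 1, size=len(halo))
        for (i, j), k in zip(halo, delays):
            k = min(int(k), len(history) - 1)
            stale = history[-1 - k][j]   # possibly outdated neighbour value
            if j < i:
                left[i - 1] = stale
            else:
                right[i - 1] = stale
        new = cur.copy()
        new[1:-1] = cur[1:-1] + r * (left - 2.0 * cur[1:-1] + right)
        history = (history + [new])[-(max_delay + 1):]
    return x, history[-1]

x, u = solve_heat_async()
print(float(np.max(np.abs(u - x))))      # distance from the steady state u = x
```

With these assumed settings the delayed run still settles on the steady state u(x) = x, in line with the abstract's remark that the asynchronous scheme is accurate for steady-state values. Setting `max_delay=0` recovers the ordinary synchronous update.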

    Evaluation of finite difference based asynchronous partial differential equations solver for reacting flows

    Next-generation exascale machines with extreme levels of parallelism will provide massive computing resources for large-scale numerical simulations of complex physical systems at unprecedented parameter ranges. However, novel numerical methods, scalable algorithms and a redesign of current state-of-the-art numerical solvers are required for scaling to these machines with minimal overheads. One such approach for solvers based on partial differential equations computes spatial derivatives with possibly delayed or asynchronous data, using high-order asynchrony-tolerant (AT) schemes to mitigate communication and synchronization bottlenecks without affecting numerical accuracy. In the present study, an effective methodology for implementing temporal discretization using a multi-stage Runge-Kutta method with AT schemes is presented. Together these schemes are used to perform asynchronous simulations of canonical reacting flow problems, demonstrated in one dimension, including auto-ignition of a premixture, premixed flame propagation and non-premixed autoignition. Simulation results show that the AT schemes incur very small numerical errors in all key quantities of interest, including stiff intermediate species, despite delayed data at processing element (PE) boundaries. For simulations of supersonic flows, the degraded numerical accuracy of well-known shock-resolving WENO (weighted essentially non-oscillatory) schemes when used with relaxed synchronization is also discussed. To overcome this loss of accuracy, high-order AT-WENO schemes are derived and tested on linear and non-linear equations. Finally, the novel AT-WENO schemes are demonstrated in the propagation of a detonation wave with delays at PE boundaries.

    Cumulative reports and publications

    A complete list of Institute for Computer Applications in Science and Engineering (ICASE) reports is presented. Since ICASE reports are intended to be preprints of articles that will appear in journals or conference proceedings, the published reference is included when it is available. The major categories of the current ICASE research program are: applied and numerical mathematics, including numerical analysis and algorithm development; theoretical and computational research in fluid mechanics in selected areas of interest to LaRC, including acoustics and combustion; experimental research in transition and turbulence and aerodynamics involving LaRC facilities and scientists; and computer science.

    Cumulative reports and publications through December 31, 1990

    This document contains a complete list of ICASE reports. Since ICASE reports are intended to be preprints of articles that will appear in journals or conference proceedings, the published reference is included when it is available.

    Highly Scalable Asynchronous Computing Method for Partial Differential Equations: A Path Towards Exascale

    Many natural and engineering systems are governed by nonlinear partial differential equations (PDEs) that give rise to multiscale phenomena, e.g. turbulent flows. Numerical simulations of these problems are computationally very expensive and demand extreme levels of parallelism. At realistic conditions, simulations are being carried out on massively parallel computers with hundreds of thousands of processing elements (PEs). It has been observed that communication between PEs, as well as their synchronization at these extreme scales, takes up a significant portion of the total simulation time and results in poor scalability of codes. This issue is likely to pose a bottleneck in the scalability of codes on future Exascale systems. In this work, we propose an asynchronous computing algorithm based on widely used finite difference methods to solve PDEs, in which synchronization between PEs due to communication is relaxed at a mathematical level. We show that while stability is preserved when schemes are used asynchronously, accuracy is greatly degraded. Since message arrivals at PEs are random processes, so is the behavior of the error. We propose a new statistical framework in which we show that average errors always drop to first order, regardless of the order of the original scheme. We propose new asynchrony-tolerant schemes that maintain accuracy when synchronization is relaxed. The quality of the solution is shown to depend not only on the physical phenomena and numerical schemes, but also on the characteristics of the computing machine. A novel algorithm using remote memory access communications has been developed to demonstrate excellent scalability of the method for large-scale computing. Finally, we present a path to extend this method to solving complex multi-scale problems on Exascale machines.
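
To make the loss of accuracy mentioned in this abstract concrete, the following worked expansion looks at a single grid point next to a PE boundary. The model problem (1D heat equation, forward Euler, central differences) and the symbols k, r and alpha are assumptions chosen for illustration; the abstract itself does not state them, and this is not the authors' statistical framework.

```latex
% Assumed model problem (not stated in the abstract): u_t = \alpha u_{xx},
% forward Euler in time, second-order central difference in space.
% k >= 0 is a hypothetical delay, in time levels, of the right neighbour's value.
\begin{align*}
\frac{u_i^{n+1}-u_i^{n}}{\Delta t}
  &= \alpha\,\frac{u_{i+1}^{\,n-k}-2u_i^{n}+u_{i-1}^{n}}{\Delta x^{2}}, \\
u_{i+1}^{\,n-k}
  &= u_{i+1}^{n}-k\,\Delta t\,\left.\frac{\partial u}{\partial t}\right|_{i+1}^{n}
     +\mathcal{O}\!\left(k^{2}\Delta t^{2}\right), \\
\alpha\,\frac{u_{i+1}^{\,n-k}-u_{i+1}^{n}}{\Delta x^{2}}
  &= -\,k\,r\,\left.\frac{\partial u}{\partial t}\right|_{i+1}^{n}
     +\mathcal{O}\!\left(k^{2}\,r\,\Delta t\right),
  \qquad r=\frac{\alpha\,\Delta t}{\Delta x^{2}}.
\end{align*}
```

For k = 0 the extra term vanishes and the usual synchronous scheme is recovered. For k > 0 and fixed r, the extra term does not shrink under grid refinement at the affected point, which is one way to see why accuracy degrades there. Because only points adjacent to PE boundaries are affected, the domain-averaged error can still decrease with refinement; the abstract reports first-order behavior on average.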

    Development of a Navier-Stokes algorithm for parallel-processing supercomputers

    An explicit flow solver, applicable to the hierarchy of model equations ranging from Euler to full Navier-Stokes, is combined with several techniques designed to reduce computational expense. The computational domain consists of local grid refinements embedded in a global coarse mesh, where the locations of these refinements are defined by the physics of the flow. Flow characteristics are also used to determine which set of model equations is appropriate for solution in each region, thereby reducing not only the number of grid points at which the solution must be obtained, but also the computational effort required to get that solution. Acceleration to steady-state is achieved by applying multigrid on each of the subgrids, regardless of the particular model equations being solved. Since each of these components is explicit, advantage can readily be taken of the vector- and parallel-processing capabilities of machines such as the Cray X-MP and Cray-2.

    Cumulative reports and publications through December 31, 1989

    A complete list of reports from the Institute for Computer Applications in Science and Engineering (ICASE) is presented. The major categories of the current ICASE research program are: numerical methods, with particular emphasis on the development and analysis of basic numerical algorithms; control and parameter identification problems, with emphasis on effectual numerical methods; computational problems in engineering and the physical sciences, particularly fluid dynamics, acoustics, structural analysis, and chemistry; and computer systems and software, especially vector and parallel computers, microcomputers, and data management. Since ICASE reports are intended to be preprints of articles that will appear in journals or conference proceedings, the published reference is included when it is available.

    Cumulative reports and publications through December 31, 1988

    This document contains a complete list of ICASE Reports. Since ICASE Reports are intended to be preprints of articles that will appear in journals or conference proceedings, the published reference is included when it is available.

    Towards Asynchronous Simulations of Turbulent Flows: Accuracy, Performance, and Optimization

    Our understanding of turbulence has relied heavily on high-fidelity Direct Numerical Simulations (DNS) that resolve all dynamically relevant scales. But because of the inherent complexities of turbulent flows, these simulations are computationally very expensive and practically impossible at realistic conditions. Advancements in high performance computing have provided a much-needed boost to computational resources through increasing levels of parallelism and made DNS realizable, albeit only in a limited parameter range. As the number of processing elements (PEs) in parallel machines increases, the penalties incurred in current algorithms due to the communications and synchronizations between PEs needed to update data become significant. These overheads are expected to pose a serious challenge to scalability on the next-generation exascale machines. An effective way to mitigate this bottleneck is to relax strict communication and synchronization constraints and proceed with computations asynchronously, i.e. without waiting for updated information from the other PEs. In this work, we investigate the viability of such asynchronous computing using high-order Asynchrony-Tolerant (AT) schemes for accurate and scalable simulations of reacting and non-reacting turbulence at extreme scales. For this, we first assess the important numerical properties of AT schemes, including conservation, stability, and spectral accuracy. Through rigorous mathematical analysis, we expose the breakdown of the standard von Neumann analysis for stability of multi-level schemes, even for widely used synchronous schemes. We overcome these limitations through what we call the generalized von Neumann analysis, which is then used to assess the stability of the AT schemes. Following this, we propose and implement two computational algorithms to introduce asynchrony in a three-dimensional compressible flow solver. We use these to perform first-of-a-kind asynchronous simulations of compressible turbulence and analyze the effect of asynchrony on important physical characteristics of turbulence. Specifically, we show that both large-scale and small-scale features, including highly intermittent instantaneous events, are accurately resolved by these algorithms. We also show excellent strong and weak scaling of asynchronous algorithms up to a processor count of P = 262144 because of a significant reduction in communication overheads. As a precursor to the development of asynchronous combustion codes for simulations of more challenging problems with additional physical and numerical complexities, we investigate the effect of asynchrony on several canonical reacting flows. Furthermore, for problems with shocks and discontinuities, such as detonations, we derive and verify AT-WENO (weighted essentially non-oscillatory) schemes. With the ultimate goal of deriving new optimal AT schemes, we also develop a unified framework for the derivation of finite difference schemes. We show explicit trade-offs between order of accuracy, spectral accuracy and stability under this unifying framework, which can be exploited to devise very accurate numerical schemes for asynchronous computations on extreme scales with minimal overheads.
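
For readers unfamiliar with the standard von Neumann analysis that this abstract takes as its starting point, here is a minimal sketch for the simplest two-level case: the forward-time, centered-space (FTCS) scheme for the 1D heat equation. The scheme choice and the function names are assumptions for illustration only. The generalized analysis for multi-level and AT schemes described in the abstract is not reproduced here.

```python
import numpy as np

def ftcs_amplification(r, theta):
    """Amplification factor G(theta) = 1 - 4 r sin^2(theta/2) of the FTCS
    heat-equation scheme, with r = alpha*dt/dx**2 and theta = wavenumber * dx."""
    return 1.0 - 4.0 * r * np.sin(theta / 2.0) ** 2

def is_von_neumann_stable(r, n_modes=721):
    """Check |G(theta)| <= 1 on a sample of Fourier modes in [0, pi]."""
    theta = np.linspace(0.0, np.pi, n_modes)
    g = ftcs_amplification(r, theta)
    return bool(np.max(np.abs(g)) <= 1.0 + 1e-12)

for r in (0.25, 0.5, 0.51):
    print(r, is_von_neumann_stable(r))
```

For this two-level scheme the check reproduces the familiar condition r <= 1/2. A scheme that reads several earlier time levels has more than one amplification root per mode, so a scalar test like this one no longer covers it.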