2,808 research outputs found

    Domain Decomposition Based High Performance Parallel Computing

    The study deals with the parallelization of finite element based Navier-Stokes codes using domain decomposition and state-of-the-art sparse direct solvers. There has been significant improvement in the performance of sparse direct solvers. However, parallel sparse direct solvers are not found to exhibit good scalability. Hence, the parallelization of sparse direct solvers is done using domain decomposition techniques. A highly efficient sparse direct solver, PARDISO, is used in this study. The scalability of both the Newton and modified Newton algorithms is tested.
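The abstract above contrasts Newton and modified Newton iterations without showing the difference. The following is a minimal sketch, assuming the common reading in which the modified variant reuses (freezes) the Jacobian from the initial guess so that the matrix only needs to be factorized once. The 2x2 test system, the function names, and the tolerance are hypothetical choices for illustration and are not taken from the study, which solves large sparse systems with PARDISO.

```python
# Illustrative sketch only: Newton vs. modified (frozen-Jacobian) Newton
# on a hypothetical 2x2 nonlinear system. Not the solver used in the study.

def residual(x, y):
    # F(x, y) = 0 has a root at (x, y) = (1, 2).
    return (x * x + y - 3.0, x + y * y - 5.0)

def jacobian(x, y):
    # Analytic Jacobian of the residual above.
    return ((2.0 * x, 1.0), (1.0, 2.0 * y))

def solve2x2(J, r):
    # Solve J * step = r by Cramer's rule (stand-in for a sparse direct solve).
    (a, b), (c, d) = J
    det = a * d - b * c
    return ((r[0] * d - b * r[1]) / det, (a * r[1] - c * r[0]) / det)

def newton(x, y, modified=False, tol=1e-10, max_iter=200):
    """Return (x, y, iterations). With modified=True the Jacobian is
    evaluated once at the initial guess and reused in every iteration."""
    J = jacobian(x, y)
    for it in range(max_iter):
        r = residual(x, y)
        if max(abs(r[0]), abs(r[1])) < tol:
            return x, y, it
        if not modified:
            J = jacobian(x, y)  # full Newton: refresh the Jacobian each step
        dx, dy = solve2x2(J, r)
        x, y = x - dx, y - dy
    return x, y, max_iter

if __name__ == "__main__":
    print("Newton:         ", newton(1.2, 1.8))
    print("Modified Newton:", newton(1.2, 1.8, modified=True))
```

In this sketch the frozen-Jacobian variant needs more iterations but each one is cheaper, which is the usual trade-off when a direct factorization dominates the cost of an iteration.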

    Effective data parallel computing on multicore processors

    The rise of chip multiprocessing, or the integration of multiple general purpose processing cores on a single chip (multicores), has impacted all computing platforms, including high performance, server, desktop, mobile, and embedded processors. Programmers can no longer expect continued increases in software performance without developing parallel, memory hierarchy friendly software that can effectively exploit the chip level multiprocessing paradigm of multicores. The goal of this dissertation is to demonstrate a design process for data parallel problems that starts with a sequential algorithm and ends with a high performance implementation on a multicore platform. Our design process combines theoretical algorithm analysis with practical optimization techniques. Our target multicores are quad-core processors from Intel and the eight-SPE IBM Cell B.E. Target applications include Matrix Multiplication (MM), Finite Difference Time Domain (FDTD), LU Decomposition (LUD), and Power Flow Solver based on Gauss-Seidel (PFS-GS) algorithms. These applications are popular computation methods in science and engineering problems and are characterized by unit-stride (MM, LUD, and PFS-GS) or 2-point stencil (FDTD) memory access patterns. The main contributions of this dissertation include a cache- and space-efficient algorithm model, integrated data pre-fetching and caching strategies, and in-core optimization techniques. Our multicore-efficient implementations of the applications described above outperform naïve parallel implementations by at least 2x and scale well with problem size and with the number of processing cores.

    Newton-like Methods for Navier-Stokes Solution. G.U. Aero Report 9229

    The paper reports on Newton-like methods, called the SFDN-a-GMRES and SQN-a-GMRES methods, that have been devised and proven as powerful schemes for large nonlinear problems typical of viscous compressible Navier-Stokes solutions. They can be applied starting from a partially converged solution obtained with a conventional explicit or approximate implicit method. Developments have included the efficient parallelisation of the schemes on a distributed memory parallel computer. The methods are illustrated using a RISC workstation and a transputer parallel system respectively to solve a hypersonic vortical flow.

    Constraint programming on a heterogeneous multicore architecture

    Constraint programming libraries are useful when building applications developed mostly in mainstream programming languages: they do not require the developers to acquire skills for a new language, providing instead declarative programming tools for use within conventional systems. Some approaches to constraint programming favour completeness, such as propagation-based systems. Others are more interested in getting to a good solution fast, regardless of whether all solutions may be found; this approach is used in local search systems. Designing hybrid approaches (propagation + local search) seems promising since the advantages of both may be combined into a single approach. Parallel architectures are becoming more commonplace, partly due to the large-scale availability of individual systems but also because of the trend towards generalizing the use of multicore microprocessors, that is, processors with several processing units. In this thesis an architecture for mixed constraint solvers is proposed, relying both on propagation and local search, which is designed to function effectively on a heterogeneous multicore architecture.

    Parallelization of Explicit and Fully Implicit Navier-Stokes Solutions for Compressible Flows. G.U. Aero Report 9236

    The paper describes two studies on the parallelisation of algorithms for the numerical calculation of hypersonic viscous flows over generic vehicle configurations by solving the Navier-Stokes equations. One involved a scalable explicit formulation that achieved high parallel efficiency on 32 processors of an Intel iPSC860 Hypercube when calculating the 3-dimensional flow over a blunt delta wing at high incidence. The other involved a fully implicit formulation using a Newton-like procedure with a preconditioned GMRES solver, which achieved high efficiency on 8 transputers of a Meiko Computing Surface computer when calculating the flow over a cone at high incidence. This latter approach, although more complex, has the potential to provide more rapid convergence, and hence greater efficiency, than the explicit scheme. High order upwind discretisation was used in each case in order to achieve high resolution of important shock and viscous phenomena within the flow field. The work reported contributes to the aim of a wider programme of work, a summary of which is included, to provide the computational tools to calculate accurately and efficiently steady and unsteady viscous compressible flows over complex aerospace configurations.

    Parallel Newton Method for High-Speed Viscous Separated Flowfields. G.U. Aero Report 9210

    This paper presents a new technique to parallelize Newton's method for locally conical approximate, laminar Navier-Stokes solutions on a distributed memory parallel computer. The method uses Newton's method for nonlinear systems of equations to find steady-state solutions. The parallelization is based on a parallel iterative solver for large sparse non-symmetric linear systems. The distributed storage of the matrix data induces a corresponding geometric domain decomposition. The large sparse Jacobian matrix is then generated in a distributed manner, with each subdomain assembling its own part. Since the numerical algorithms on the global domain are unchanged, the convergence and the accuracy of the original sequential scheme are maintained, and no inner boundary condition is needed.

    ADER-WENO Finite Volume Schemes with Space-Time Adaptive Mesh Refinement

    We present the first high order one-step ADER-WENO finite volume scheme with Adaptive Mesh Refinement (AMR) in multiple space dimensions. High order spatial accuracy is obtained through a WENO reconstruction, while a high order one-step time discretization is achieved using a local space-time discontinuous Galerkin predictor method. Due to the one-step nature of the underlying scheme, the resulting algorithm is particularly well suited for an AMR strategy on space-time adaptive meshes, i.e. with time-accurate local time stepping. The AMR property has been implemented 'cell-by-cell', with a standard tree-type algorithm, while the scheme has been parallelized via the Message Passing Interface (MPI) paradigm. The new scheme has been tested over a wide range of examples for nonlinear systems of hyperbolic conservation laws, including the classical Euler equations of compressible gas dynamics and the equations of magnetohydrodynamics (MHD). High order accuracy in space and time has been confirmed via a numerical convergence study, and a detailed analysis of the computational speed-up with respect to highly refined uniform meshes is also presented. We also show test problems where the presented high order AMR scheme behaves clearly better than traditional second order AMR methods. The proposed scheme, which combines for the first time high order ADER methods with space-time adaptive grids in two and three space dimensions, is likely to become a useful tool in several fields of computational physics, applied mathematics and mechanics. Comment: With updated bibliography information

    Simulating radiative shocks in nozzle shock tubes

    We use the recently developed Center for Radiative Shock Hydrodynamics (CRASH) code to numerically simulate laser-driven radiative shock experiments. These shocks are launched by an ablated beryllium disk and are driven down xenon-filled plastic tubes. The simulations are initialized by the two-dimensional version of the Lagrangian Hyades code, which is used to evaluate the laser energy deposition during the first 1.1 ns. The later times are calculated with the CRASH code. This code solves for the multi-material hydrodynamics with separate electron and ion temperatures on an Eulerian block-adaptive mesh and includes multi-group flux-limited radiation diffusion and electron thermal heat conduction. The goal of the present paper is to demonstrate the capability to simulate radiative shocks in essentially three-dimensional experimental configurations, such as circular and elliptical nozzles. We show that the compound shock structure of the primary and wall shock is captured and verify that the shock properties are consistent with order-of-magnitude estimates. The synthetic radiographs produced can be used for comparison with future nozzle experiments at high-energy-density laser facilities. Comment: submitted to High Energy Density Physics