13 research outputs found

    Puffin: A three dimensional, unaveraged free electron laser simulation code

    The broadband, 3D FEL code Puffin is presented. The analytical model is derived in the absence of the Slowly Varying Envelope Approximation, and can model undulators of any polarisation. Due to the enhanced resolution, the memory and processing requirements are greater than those of equivalent averaged codes. The numerical code to solve the system of equations is therefore written for a parallel computing environment utilizing MPI. Some example simulations are presented.

    Efficient discontinuous finite difference meshes for 3-D Laplace-Fourier domain seismic wavefield modelling in acoustic media with embedded boundaries

    Simulation of acoustic wave propagation in the Laplace-Fourier (LF) domain, with a spatially uniform mesh, can be computationally demanding, especially in areas with large velocity contrasts. To improve efficiency and convergence, we use 3-D second- and fourth-order velocity-pressure finite difference (FD) discontinuous meshes (DM). Our DM algorithm can use any spatial discretization ratio between meshes. We evaluate direct and iterative parallel solvers for computational speed, memory requirements and convergence. Benchmarks in realistic 3-D models and topographies show that DM with direct solvers gives more efficient and stable results than uniform meshes with iterative solvers.

    ELSI: A Unified Software Interface for Kohn-Sham Electronic Structure Solvers

    Solving the electronic structure from a generalized or standard eigenproblem is often the bottleneck in large-scale calculations based on Kohn-Sham density-functional theory. This problem must be addressed by essentially all current electronic structure codes, based on similar matrix expressions, and by high-performance computation. Here we present a unified software interface, ELSI, to access different strategies that address the Kohn-Sham eigenvalue problem. Currently supported algorithms include the dense generalized eigensolver library ELPA, the orbital minimization method implemented in libOMM, and the pole expansion and selected inversion (PEXSI) approach with lower computational complexity for semilocal density functionals. The ELSI interface aims to simplify the implementation and optimal use of the different strategies by offering (a) a unified software framework designed for the electronic structure solvers in Kohn-Sham density-functional theory; (b) reasonable default parameters for a chosen solver; (c) automatic conversion between input and internal working matrix formats; and, in the future, (d) recommendation of the optimal solver depending on the specific problem. Comparative benchmarks are shown for system sizes up to 11,520 atoms (172,800 basis functions) on distributed-memory supercomputing architectures. Comment: 55 pages, 14 figures, 2 tables.

    Parallel symbolic factorization for sparse LU with static pivoting

    This paper presents the design and implementation of a memory-scalable parallel symbolic factorization algorithm for general sparse unsymmetric matrices. Our parallel algorithm uses a graph partitioning approach, applied to the graph of |A|+|A|^T, to partition the matrix in a way that is good for sparsity preservation as well as for parallel factorization. The partitioning yields a so-called separator tree, which represents the dependencies among the computations. We use the separator tree to distribute the input matrix over the processors using a block cyclic approach and a subtree-to-subprocessor mapping. The parallel algorithm performs a bottom-up traversal of the separator tree. With a combination of right-looking and left-looking partial factorizations, the algorithm obtains one column structure of L and one row structure of U at each step. The algorithm is implemented in C and MPI. From a performance study on large matrices, we show that the parallel algorithm significantly reduces the memory requirement of the symbolic factorization step, as well as the overall memory requirement of the parallel solver. It also often reduces the runtime of the sequential algorithm, which is already relatively small. In general, the parallel algorithm prevents the symbolic factorization step from being a time or memory bottleneck of the parallel solver.
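    The graph of |A|+|A|^T mentioned in this abstract can be illustrated with a minimal sketch. This is not the paper's implementation (which is written in C and MPI); the function name `symmetrized_pattern` and the 4x4 example pattern are hypothetical, chosen only to show how an unsymmetric nonzero pattern becomes an undirected graph that a partitioner could work on.

```python
def symmetrized_pattern(entries, n):
    """Return adjacency sets of the graph of |A| + |A|^T for an n x n matrix.

    `entries` is an iterable of (row, col) index pairs marking the
    nonzeros of A. Only the sparsity pattern matters, not the values.
    """
    adj = [set() for _ in range(n)]
    for i, j in entries:
        if i != j:  # diagonal entries contribute no graph edge
            adj[i].add(j)  # edge from the entry A[i, j]
            adj[j].add(i)  # mirrored edge from the transpose
    return adj


# Hypothetical unsymmetric 4x4 nonzero pattern (illustration only).
nonzeros = [(0, 0), (0, 2), (1, 1), (2, 1), (2, 2), (3, 0), (3, 3)]
graph = symmetrized_pattern(nonzeros, 4)

for vertex, neighbours in enumerate(graph):
    print(vertex, sorted(neighbours))
```

    Every edge appears in both directions, so the result is an undirected graph even though the input pattern is unsymmetric.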

    Development of Modal Analysis for the Study of Global Modes in High Speed Boundary Layer Flows

    University of Minnesota Ph.D. dissertation. May 2017. Major: Aerospace Engineering and Mechanics. Advisor: Graham Candler. 1 computer file (PDF); x, 108 pages. Boundary layer transition for compressible flows remains a challenging and unsolved problem. In the context of high-speed compressible flow, transitional and turbulent boundary layers produce significantly higher surface heating caused by an increase in skin friction. The higher heating associated with transitional and turbulent boundary layers drives thermal protection systems (TPS) and mission trajectory bounds. Proper understanding of the mechanisms that drive transition is crucial to the successful design and operation of next-generation spacecraft. Currently, prediction of boundary-layer transition is based on experimental efforts and computational stability analysis. Computational analysis, anchored by experimental correlations, offers an avenue to assess and predict stability at a reduced cost. Classical methods of Linearized Stability Theory (LST) and Parabolized Stability Equations (PSE) have proven to be very useful for simple geometries and base flows. Under certain conditions the assumptions that are inherent to classical methods become invalid and the use of LST/PSE is inaccurate. In these situations, a global approach must be considered. A TriGlobal stability analysis code, Global Mode Analysis in US3D (GMAUS3D), has been developed and implemented into the unstructured solver US3D. A discussion of the methodology and implementation will be presented. Two flow configurations are presented in an effort to validate and verify the approach. First, stability analysis for a subsonic cylinder wake is performed and results compared to literature. Second, a supersonic blunt cone is considered to directly compare LST/PSE analysis and results generated by GMAUS3D.