13 research outputs found

Puffin : A three dimensional, unaveraged free electron laser simulation code

Author: Campbell L. T.
McNeil B. W. J.
Publication venue: JACoW
Publication date: 31/05/2013

The broadband, 3D FEL code Puffin is presented. The analytical model is derived in the absence of the Slowly Varying Envelope Approximation, and can model undulators of any polarisation. Due to the enhanced resolution, the memory and processing requirements are greater than equivalent unaveraged codes. The numerical code to solve the system of equations is therefore written for a parallel computing environment utilizing MPI. Some example simulations are presented.

University of Strathclyde Institutional Repository

Efficient discontinuous finite difference meshes for 3-D Laplace-Fourier domain seismic wavefield modelling in acoustic media with embedded boundaries

Author: Alsalem HJ
Newman G
Petrov P
Rector J
Um E
Publication venue: eScholarship, University of California
Publication date: 26/07/2019

Simulation of acoustic wave propagation in the Laplace-Fourier (LF) domain, with a spatially uniform mesh, can be computationally demanding, especially in areas with large velocity contrasts. To improve efficiency and convergence, we use 3-D second- and fourth-order velocity-pressure finite difference (FD) discontinuous meshes (DM). Our DM algorithm can use any spatial discretization ratio between meshes. We evaluate direct and iterative parallel solvers for computational speed, memory requirements and convergence. Benchmarks in realistic 3-D models and topographies show more efficient and stable results for DM with direct solvers than uniform mesh results with iterative solvers.

Crossref

eScholarship - University of California

Final Report for UC Berkeley Terascale Optimal PDE Solvers TOPS DOE Award Number DE-FC02-01ER25478 9/15/2001 – 9/14/2006

Author: Demmel James
Publication venue: 'Office of Scientific and Technical Information (OSTI)'
Publication date: 26/02/2007

In many areas of science, physical experimentation may be too dangerous, too expensive or even impossible. Instead, large-scale simulations, validated by comparison with related experiments in well-understood laboratory contexts, are used by scientists to gain insight and confirmation of existing theories in such areas, without benefit of full experimental verification. The goal of the TOPS ISIC was to develop and implement algorithms and support scientific investigations performed by DOE-sponsored researchers. A major component of this effort is to provide software for large-scale parallel computers capable of efficiently solving the enormous systems of equations arising from the nonlinear PDEs underlying these simulations. Several TOPS-supported packages were designed in part (ScaLAPACK) or in whole (SuperLU) at Berkeley, and are widely used beyond SciDAC and DOE. Beyond continuing to develop these codes, our main effort focused on automatic performance tuning of the sparse matrix kernels (e.g., sparse matrix-vector multiply, or SpMV) at the core of many TOPS iterative solvers. Based on the observation that the fastest implementation of SpMV (and other kernels) can depend dramatically both on the computer and the matrix (the latter of which is not known until run-time), we developed and released a system called OSKI (Optimized Sparse Kernel Interface) that will automatically produce optimized versions of SpMV (and other kernels), hiding complicated implementation details from the user. OSKI led to a 2x speedup in SpMV in a DOE accelerator design code, a 2x speedup in a commercial lithography simulation, and has been downloaded over 500 times. In addition to a stand-alone version, OSKI was also integrated into the TOPS-supported PETSc system.

UNT Digital Library
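
For readers unfamiliar with the SpMV kernel named in the TOPS report abstract above, the following is a minimal, untuned reference sketch in Python. It is not OSKI's interface or implementation; the function name `csr_spmv`, the compressed sparse row (CSR) layout, and the small example matrix are assumptions chosen purely for illustration. A tuning system such as the one described would select a different data structure or loop ordering at run-time depending on the matrix and the machine.

```python
def csr_spmv(indptr, indices, data, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    indptr[r]..indptr[r+1] delimits the nonzeros of row r;
    indices holds their column numbers and data their values.
    """
    n_rows = len(indptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        acc = 0.0
        for k in range(indptr[row], indptr[row + 1]):
            acc += data[k] * x[indices[k]]
        y[row] = acc
    return y


# Hypothetical 3x3 example matrix:
# [[4, 0, 1],
#  [0, 3, 0],
#  [2, 0, 5]]
indptr = [0, 2, 3, 5]
indices = [0, 2, 1, 0, 2]
data = [4.0, 1.0, 3.0, 2.0, 5.0]
y = csr_spmv(indptr, indices, data, [1.0, 2.0, 3.0])
```

The irregular, matrix-dependent access `x[indices[k]]` in the inner loop is what makes the fastest implementation depend on the sparsity pattern, which is only known once the matrix is available.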

ELSI: A Unified Software Interface for Kohn-Sham Electronic Structure Solvers

Author: Blum Volker
Corsetti Fabiano
García Alberto
Huhn William P.
Jacquelin Mathias
Jia Weile
Lange Björn
Lin Lin
Lu Jianfeng
Mi Wenhui
Seifitokaldani Ali
Vázquez-Mayagoitia Álvaro
Yang Chao
Yang Haizhao
Yu Victor Wen-zhe
Publication venue: 'Elsevier BV'
Publication date: 31/05/2017

Solving the electronic structure from a generalized or standard eigenproblem is often the bottleneck in large scale calculations based on Kohn-Sham density-functional theory. This problem must be addressed by essentially all current electronic structure codes, based on similar matrix expressions, and by high-performance computation. We here present a unified software interface, ELSI, to access different strategies that address the Kohn-Sham eigenvalue problem. Currently supported algorithms include the dense generalized eigensolver library ELPA, the orbital minimization method implemented in libOMM, and the pole expansion and selected inversion (PEXSI) approach with lower computational complexity for semilocal density functionals. The ELSI interface aims to simplify the implementation and optimal use of the different strategies, by offering (a) a unified software framework designed for the electronic structure solvers in Kohn-Sham density-functional theory; (b) reasonable default parameters for a chosen solver; (c) automatic conversion between input and internal working matrix formats, and in the future (d) recommendation of the optimal solver depending on the specific problem. Comparative benchmarks are shown for system sizes up to 11,520 atoms (172,800 basis functions) on distributed memory supercomputing architectures. Comment: 55 pages, 14 figures, 2 tables

arXiv.org e-Print Archive

Crossref

eScholarship - University of California

Parallel symbolic factorization for sparse LU with static pivoting

Author: L. Grigori
Publication date: 01/01/2007

This paper presents the design and implementation of a memory scalable parallel symbolic factorization algorithm for general sparse unsymmetric matrices. Our parallel algorithm uses a graph partitioning approach, applied to the graph of |A|+|A|^T, to partition the matrix in such a way that is good for sparsity preservation as well as for parallel factorization. The partitioning yields a so-called separator tree which represents the dependencies among the computations. We use the separator tree to distribute the input matrix over the processors using a block cyclic approach and a subtree to sub-processor mapping. The parallel algorithm performs a bottom up traversal of the separator tree. With a combination of right-looking and left-looking partial factorizations, the algorithm obtains one column structure of L and one row structure of U at each step. The algorithm is implemented in C and MPI. From a performance study on large matrices, we show that the parallel algorithm significantly reduces the memory requirement of the symbolic factorization step, as well as the overall memory requirement of the parallel solver. It also often reduces the runtime of the sequential algorithm, which is already relatively small. In general, the parallel algorithm prevents the symbolic factorization step from being a time or memory bottleneck of the parallel solver.

CiteSeerX
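
The abstract above applies graph partitioning to the graph of |A|+|A|^T. As a rough illustration of what that graph is, the sketch below builds its adjacency sets from the nonzero positions of an unsymmetric matrix. It is not the paper's C and MPI implementation; the function name `symmetrized_pattern`, the coordinate-list input, and the example entries are hypothetical, and no partitioning or separator tree is computed here.

```python
def symmetrized_pattern(n, entries):
    """Return adjacency sets of the undirected graph of |A| + |A|^T.

    entries is an iterable of (row, col) positions of the nonzeros of
    an n-by-n matrix A. An edge {i, j} exists when a_ij or a_ji is
    nonzero; diagonal entries contribute no edge.
    """
    adj = [set() for _ in range(n)]
    for i, j in entries:
        if i != j:
            adj[i].add(j)
            adj[j].add(i)
    return adj


# Hypothetical unsymmetric pattern: nonzeros at (0,2) and (2,1)
# have no mirrored counterparts in A itself.
entries = [(0, 0), (0, 2), (1, 1), (2, 1), (2, 2)]
adj = symmetrized_pattern(3, entries)
```

Symmetrizing the pattern in this way lets a standard undirected graph partitioner be applied even though A itself is unsymmetric.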

Parallel Symbolic Factorization for Sparse LU with Static Pivoting

Author: James W. Demmel
Laura Grigori
Xiaoye S. Li
Publication venue: 'Society for Industrial & Applied Mathematics (SIAM)'

Crossref

Development of Modal Analysis for the Study of Global Modes in High Speed Boundary Layer Flows

Author: Brock Joseph
Publication date: 01/05/2017

University of Minnesota Ph.D. dissertation. May 2017. Major: Aerospace Engineering and Mechanics. Advisor: Graham Candler. 1 computer file (PDF); x, 108 pages. Boundary layer transition for compressible flows remains a challenging and unsolved problem. In the context of high-speed compressible flow, transitional and turbulent boundary-layers produce significantly higher surface heating caused by an increase in skin-friction. The higher heating associated with transitional and turbulent boundary layers drives thermal protection systems (TPS) and mission trajectory bounds. Proper understanding of the mechanisms that drive transition is crucial to the successful design and operation of the next generation spacecraft. Currently, prediction of boundary-layer transition is based on experimental efforts and computational stability analysis. Computational analysis, anchored by experimental correlations, offers an avenue to assess/predict stability at a reduced cost. Classical methods of Linearized Stability Theory (LST) and Parabolized Stability Equations (PSE) have proven to be very useful for simple geometries/base flows. Under certain conditions the assumptions that are inherent to classical methods become invalid and the use of LST/PSE is inaccurate. In these situations, a global approach must be considered. A TriGlobal stability analysis code, Global Mode Analysis in US3D (GMAUS3D), has been developed and implemented into the unstructured solver US3D. A discussion of the methodology and implementation will be presented. Two flow configurations are presented in an effort to validate/verify the approach. First, stability analysis for a subsonic cylinder wake is performed and results compared to literature. Second, a supersonic blunt cone is considered to directly compare LST/PSE analysis and results generated by GMAUS3D.

University of Minnesota Digital Conservancy