13 research outputs found
Puffin: A three dimensional, unaveraged free electron laser simulation code
The broadband, 3D FEL code Puffin is presented. The analytical model is derived in the absence of the Slowly Varying Envelope Approximation, and can model undulators of any polarisation. Due to the enhanced resolution, the memory and processing requirements are greater than those of equivalent averaged codes. The numerical code to solve the system of equations is therefore written for a parallel computing environment utilizing MPI. Some example simulations are presented.
Efficient discontinuous finite difference meshes for 3-D Laplace-Fourier domain seismic wavefield modelling in acoustic media with embedded boundaries
Simulation of acoustic wave propagation in the Laplace-Fourier (LF) domain, with a spatially uniform mesh, can be computationally demanding, especially in areas with large velocity contrasts. To improve efficiency and convergence, we use 3-D second- and fourth-order velocity-pressure finite difference (FD) discontinuous meshes (DM). Our DM algorithm can use any spatial discretization ratio between meshes. We evaluate direct and iterative parallel solvers for computational speed, memory requirements and convergence. Benchmarks in realistic 3-D models and topographies show more efficient and stable results for DM with direct solvers than uniform mesh results with iterative solvers.
Final Report for UC Berkeley Terascale Optimal PDE Solvers TOPS DOE Award Number DE-FC02-01ER25478 9/15/2001 – 9/14/2006
In many areas of science, physical experimentation may be too dangerous, too expensive or even impossible. Instead, large-scale simulations, validated by comparison with related experiments in well-understood laboratory contexts, are used by scientists to gain insight and confirmation of existing theories in such areas, without benefit of full experimental verification. The goal of the TOPS ISIC was to develop and implement algorithms and support scientific investigations performed by DOE-sponsored researchers. A major component of this effort is to provide software for large-scale parallel computers capable of efficiently solving the enormous systems of equations arising from the nonlinear PDEs underlying these simulations. Several TOPS-supported packages were designed in part (ScaLAPACK) or in whole (SuperLU) at Berkeley, and are widely used beyond SciDAC and DOE. Beyond continuing to develop these codes, our main effort focused on automatic performance tuning of the sparse matrix kernels (e.g. sparse-matrix-vector-multiply, or SpMV) at the core of many TOPS iterative solvers. Based on the observation that the fastest implementation of SpMV (and other kernels) can depend dramatically both on the computer and the matrix (the latter of which is not known until run-time), we developed and released a system called OSKI (Optimized Sparse Kernel Interface) that will automatically produce optimized versions of SpMV (and other kernels), hiding complicated implementation details from the user. OSKI led to a 2x speedup in SpMV in a DOE accelerator design code, a 2x speedup in a commercial lithography simulation, and has been downloaded over 500 times. In addition to a stand-alone version, OSKI was also integrated into the TOPS-supported PETSc system.
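For readers unfamiliar with the SpMV kernel named in the abstract above, the following is a minimal, untuned reference sketch of y = A x using the standard compressed sparse row (CSR) layout. It is not OSKI's API or implementation; the function name `csr_spmv` and the small test matrix are assumptions made purely for illustration. A tuned kernel of the kind the report describes would pick a different storage scheme or loop structure depending on the matrix and machine.

```python
# Illustrative baseline only: the name csr_spmv and the data below are
# hypothetical and are not taken from OSKI or the report.

def csr_spmv(indptr, indices, data, x):
    """Compute y = A @ x for a sparse matrix A stored in CSR form.

    indptr[r]..indptr[r+1] delimits the nonzeros of row r,
    indices[k] is the column of the k-th nonzero, data[k] its value.
    """
    n_rows = len(indptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        acc = 0.0
        for k in range(indptr[row], indptr[row + 1]):
            acc += data[k] * x[indices[k]]
        y[row] = acc
    return y


# A = [[4, 0, 1],
#      [0, 3, 0],
#      [2, 0, 5]]
indptr = [0, 2, 3, 5]
indices = [0, 2, 1, 0, 2]
data = [4.0, 1.0, 3.0, 2.0, 5.0]

y = csr_spmv(indptr, indices, data, [1.0, 2.0, 3.0])
print(y)
```

The inner loop reads `x` through an index array, so memory access depends on the sparsity pattern, which is only known at run time. That is one reason the fastest variant can differ from matrix to matrix.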
ELSI: A Unified Software Interface for Kohn-Sham Electronic Structure Solvers
Solving the electronic structure from a generalized or standard eigenproblem
is often the bottleneck in large scale calculations based on Kohn-Sham
density-functional theory. This problem must be addressed by essentially all
current electronic structure codes, based on similar matrix expressions, and by
high-performance computation. We here present a unified software interface,
ELSI, to access different strategies that address the Kohn-Sham eigenvalue
problem. Currently supported algorithms include the dense generalized
eigensolver library ELPA, the orbital minimization method implemented in
libOMM, and the pole expansion and selected inversion (PEXSI) approach with
lower computational complexity for semilocal density functionals. The ELSI
interface aims to simplify the implementation and optimal use of the different
strategies, by offering (a) a unified software framework designed for the
electronic structure solvers in Kohn-Sham density-functional theory; (b)
reasonable default parameters for a chosen solver; (c) automatic conversion
between input and internal working matrix formats, and in the future (d)
recommendation of the optimal solver depending on the specific problem.
Comparative benchmarks are shown for system sizes up to 11,520 atoms (172,800
basis functions) on distributed memory supercomputing architectures.
Comment: 55 pages, 14 figures, 2 tables
Parallel symbolic factorization for sparse LU with static pivoting
This paper presents the design and implementation of a memory scalable parallel symbolic factorization algorithm for general sparse unsymmetric matrices. Our parallel algorithm uses a graph partitioning approach, applied to the graph of |A| + |A|^T, to partition the matrix in a way that is good for sparsity preservation as well as for parallel factorization. The partitioning yields a so-called separator tree which represents the dependencies among the computations. We use the separator tree to distribute the input matrix over the processors using a block cyclic approach and a subtree to sub-processor mapping. The parallel algorithm performs a bottom-up traversal of the separator tree. With a combination of right-looking and left-looking partial factorizations, the algorithm obtains one column structure of L and one row structure of U at each step. The algorithm is implemented in C and MPI. From a performance study on large matrices, we show that the parallel algorithm significantly reduces the memory requirement of the symbolic factorization step, as well as the overall memory requirement of the parallel solver. It also often reduces the runtime of the sequential algorithm, which is already relatively small. In general, the parallel algorithm prevents the symbolic factorization step from being a time or memory bottleneck of the parallel solver.
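As a small illustration of the graph the abstract above refers to, the sketch below builds the undirected adjacency structure of |A| + |A|^T from the nonzero positions of an unsymmetric matrix. This covers only the symmetrization step that precedes graph partitioning; it is not the paper's parallel algorithm, and the function name `symmetrized_pattern` and the example pattern are assumptions introduced here.

```python
# Illustrative only: symmetrized_pattern and the example entries are
# hypothetical names and data, not taken from the paper.

def symmetrized_pattern(n, entries):
    """Return adjacency sets for the graph of |A| + |A|^T.

    entries is an iterable of (row, col) positions of nonzeros in an
    n-by-n matrix A. Vertices i and j are adjacent when A[i][j] or
    A[j][i] is nonzero; diagonal entries add no edges.
    """
    adj = [set() for _ in range(n)]
    for i, j in entries:
        if i != j:
            adj[i].add(j)
            adj[j].add(i)
    return adj


# Nonzero pattern of a 4x4 unsymmetric matrix.
entries = [(0, 0), (0, 2), (1, 1), (2, 1), (2, 2), (3, 0), (3, 3)]
adj = symmetrized_pattern(4, entries)
print(adj)
```

A graph partitioner would then be run on this symmetric structure to find separators; that step is left out here because it depends on the partitioning library used.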
Development of Modal Analysis for the Study of Global Modes in High Speed Boundary Layer Flows
University of Minnesota Ph.D. dissertation. May 2017. Major: Aerospace Engineering and Mechanics. Advisor: Graham Candler. 1 computer file (PDF); x, 108 pages. Boundary layer transition for compressible flows remains a challenging and unsolved problem. In the context of high-speed compressible flow, transitional and turbulent boundary-layers produce significantly higher surface heating caused by an increase in skin-friction. The higher heating associated with transitional and turbulent boundary layers drives thermal protection systems (TPS) and mission trajectory bounds. Proper understanding of the mechanisms that drive transition is crucial to the successful design and operation of the next generation spacecraft. Currently, prediction of boundary-layer transition is based on experimental efforts and computational stability analysis. Computational analysis, anchored by experimental correlations, offers an avenue to assess/predict stability at a reduced cost. Classical methods of Linearized Stability Theory (LST) and Parabolized Stability Equations (PSE) have proven to be very useful for simple geometries/base flows. Under certain conditions the assumptions that are inherent to classical methods become invalid and the use of LST/PSE is inaccurate. In these situations, a global approach must be considered. A TriGlobal stability analysis code, Global Mode Analysis in US3D (GMAUS3D), has been developed and implemented into the unstructured solver US3D. A discussion of the methodology and implementation will be presented. Two flow configurations are presented in an effort to validate/verify the approach. First, stability analysis for a subsonic cylinder wake is performed and results compared to literature. Second, a supersonic blunt cone is considered to directly compare LST/PSE analysis and results generated by GMAUS3D.