208 research outputs found

    Lanczos eigensolution method for high-performance computers

    Get PDF
    The theory, computational analysis, and applications are presented of a Lanczos algorithm on high performance computers. The computationally intensive steps of the algorithm are identified as: the matrix factorization, the forward/backward equation solution, and the matrix vector multiples. These computational steps are optimized to exploit the vector and parallel capabilities of high performance computers. The savings in computational time from applying optimization techniques such as: variable band and sparse data storage and access, loop unrolling, use of local memory, and compiler directives are presented. Two large scale structural analysis applications are described: the buckling of a composite blade stiffened panel with a cutout, and the vibration analysis of a high speed civil transport. The sequential computational time for the panel problem executed on a CONVEX computer of 181.6 seconds was decreased to 14.1 seconds with the optimized vector algorithm. The best computational time of 23 seconds for the transport problem with 17,000 degs of freedom was on the the Cray-YMP using an average of 3.63 processors

    Computational methods and software systems for dynamics and control of large space structures

    Get PDF
    Two key areas of crucial importance to the computer-based simulation of large space structures are discussed. The first area involves multibody dynamics (MBD) of flexible space structures, with applications directed to deployment, construction, and maneuvering. The second area deals with advanced software systems, with emphasis on parallel processing. The latest research thrust in the second area involves massively parallel computers

    Viscous Shock Capturing in a Time-Explicit Discontinuous Galerkin Method

    Get PDF
    We present a novel, cell-local shock detector for use with discontinuous Galerkin (DG) methods. The output of this detector is a reliably scaled, element-wise smoothness estimate which is suited as a control input to a shock capture mechanism. Using an artificial viscosity in the latter role, we obtain a DG scheme for the numerical solution of nonlinear systems of conservation laws. Building on work by Persson and Peraire, we thoroughly justify the detector's design and analyze its performance on a number of benchmark problems. We further explain the scaling and smoothing steps necessary to turn the output of the detector into a local, artificial viscosity. We close by providing an extensive array of numerical tests of the detector in use.Comment: 26 pages, 21 figure

    Format Abstraction for Sparse Tensor Algebra Compilers

    Full text link
    This paper shows how to build a sparse tensor algebra compiler that is agnostic to tensor formats (data layouts). We develop an interface that describes formats in terms of their capabilities and properties, and show how to build a modular code generator where new formats can be added as plugins. We then describe six implementations of the interface that compose to form the dense, CSR/CSF, COO, DIA, ELL, and HASH tensor formats and countless variants thereof. With these implementations at hand, our code generator can generate code to compute any tensor algebra expression on any combination of the aforementioned formats. To demonstrate our technique, we have implemented it in the taco tensor algebra compiler. Our modular code generator design makes it simple to add support for new tensor formats, and the performance of the generated code is competitive with hand-optimized implementations. Furthermore, by extending taco to support a wider range of formats specialized for different application and data characteristics, we can improve end-user application performance. For example, if input data is provided in the COO format, our technique allows computing a single matrix-vector multiplication directly with the data in COO, which is up to 3.6×\times faster than by first converting the data to CSR.Comment: Presented at OOPSLA 201
    corecore