
    Data Structures and Algorithms for Efficient Solution of Simultaneous Linear Equations from 3-D Ice Sheet Models

    Two current software packages for solving large systems of sparse simultaneous linear equations are evaluated in terms of their applicability to solving systems of equations generated by the University of Maine Ice Sheet Model. SuperLU, the first package, was developed by researchers at the University of California at Berkeley and the Lawrence Berkeley National Laboratory. UMFPACK, the second package, was developed by T. A. Davis of the University of Florida, who has ties with the U.C. Berkeley researchers as well as European researchers. Both packages are direct solvers that use LU factorization with forward and backward substitution. The University of Maine Ice Sheet Model uses the finite element method to solve partial differential equations that describe ice thickness, velocity, and temperature throughout glaciers as functions of position and time. The finite element method generates systems of linear equations having tens of thousands of variables and one hundred or so non-zero coefficients per equation. Matrices representing these systems of equations may be strictly banded or banded with right and lower borders. In order to efficiently interface the software packages with the ice sheet model, a modified compressed column data structure and supporting routines were designed and written. The data structure interfaces directly with both software packages and allows the ice sheet model to access matrix coefficients by row and column number in roughly 100 nanoseconds while storing only the non-zero entries of the matrix. No a priori knowledge of the matrix's sparsity pattern is required. Both software packages were tested with matrices produced by the model, and performance characteristics were measured and compared with banded Gaussian elimination. When combined with high-performance basic linear algebra subprograms (BLAS), the packages are as much as 5 to 7 times faster than banded Gaussian elimination. The BLAS produced by K. Goto of the University of Texas was used. Memory usage by the packages varied from slightly more than banded Gaussian elimination with UMFPACK to as much as a 40% savings with SuperLU. In addition, the packages provide componentwise backward error measures and estimates of the matrix's condition number. SuperLU is available for parallel computers as well as single-processor computers; UMFPACK is only for single-processor computers. Both packages are also capable of efficiently solving the bordered matrix problem.
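    The abstract's modified compressed-column structure can be sketched in a few lines. The following Python sketch (class and method names are hypothetical, not the thesis's actual routines) stores only non-zero entries, keeps each column's row indices sorted, and serves reads and writes by (row, column) via binary search, with no a priori knowledge of the sparsity pattern:

```python
import bisect

class ModifiedCSC:
    """Sketch of a compressed-column sparse matrix.  Only nonzeros are
    stored; entries can be read or written by (row, col) without knowing
    the sparsity pattern in advance."""

    def __init__(self, ncols):
        # One (sorted row-index list, value list) pair per column.
        self.rows = [[] for _ in range(ncols)]
        self.vals = [[] for _ in range(ncols)]

    def set(self, i, j, v):
        r, x = self.rows[j], self.vals[j]
        k = bisect.bisect_left(r, i)
        if k < len(r) and r[k] == i:
            x[k] = v               # overwrite an existing nonzero
        else:
            r.insert(k, i)         # insertion keeps the column sorted
            x.insert(k, v)

    def get(self, i, j):
        r = self.rows[j]
        k = bisect.bisect_left(r, i)   # binary search within the column
        return self.vals[j][k] if k < len(r) and r[k] == i else 0.0

m = ModifiedCSC(4)
m.set(2, 1, 5.0)
m.set(0, 1, -1.0)
print(m.get(2, 1))  # 5.0
print(m.get(3, 3))  # 0.0 (structural zero, not stored)
```

A production version would pack the columns into flat arrays so the structure matches the compressed-column layout that solvers such as SuperLU and UMFPACK consume directly.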

    The numerical solution of sparse matrix equations by fast methods and associated computational techniques

    The numerical solution of sparse matrix equations by fast methods and associated computational techniques.

    Computation of eigenvectors of block tridiagonal matrices based on twisted factorizations

    Computing the eigenvalues and eigenvectors of a band or block tridiagonal matrix is an important aspect of various applications in Scientific Computing. Most existing algorithms for computing eigenvectors of a band matrix rely on a prior tridiagonalization of the matrix. While the eigenvalues and eigenvectors of tridiagonal matrices can be computed very efficiently, the preceding tridiagonalization process can be relatively costly. Moreover, many eigensolvers require additional measures to ensure the orthogonality of the computed eigenvectors, which constitutes a significant computational expense. In this thesis we explore a new method for computing eigenvectors of block tridiagonal matrices based on twisted factorizations. We describe the basic principles of an algorithm for computing block twisted factorizations of block tridiagonal matrices. We also show some interesting properties of these twisted factorizations and investigate the relation of the block, where the factorizations meet, to an eigenvector of the block tridiagonal matrix. This relation can be exploited to compute the eigenvector in a very efficient way.
    Contrary to most conventional techniques, our algorithm for the determination of eigenvectors does not require a reduction of the matrix to tridiagonal form, and attempts to compute a good eigenvector approximation with only a single step of inverse iteration. This idea is based on finding a starting vector for inverse iteration which minimizes the residual of the resulting eigenpair. One of the main contributions of this thesis is the investigation and evaluation of different strategies for the selection of a suitable starting vector. Furthermore, we present experimental data for the accuracy, orthogonality and runtime behavior of an implementation of the new algorithm, and compare these results with existing methods. Our results show that the new algorithm returns eigenvectors with very low residuals, while being more efficient in terms of computational cost for large matrices and/or for small bandwidths. Due to its structure and inherent parallelization potential, the new algorithm is also well suited for exploiting modern and future hardware, which is characterized by a high degree of concurrency.
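    The scalar (unblocked) tridiagonal analogue of this approach can be sketched as follows; in the thesis the pivots below become blocks. Function and variable names are illustrative, not taken from the thesis:

```python
import numpy as np

def twisted_eigvec(a, b, lam):
    """Sketch: eigenvector of the symmetric tridiagonal T (diagonal a,
    off-diagonal b) for a known eigenvalue lam, via twisted factorizations
    of T - lam*I built from both ends.  The twist index with the smallest
    pivot yields the eigenvector by two back-substitutions, i.e. a single
    implicit step of inverse iteration."""
    a = np.asarray(a, float)
    b = np.asarray(b, float)
    n = len(a)
    d = np.empty(n)                     # forward (top-down) pivots
    r = np.empty(n)                     # backward (bottom-up) pivots
    d[0] = a[0] - lam
    for i in range(1, n):
        d[i] = (a[i] - lam) - b[i-1]**2 / d[i-1]
    r[n-1] = a[n-1] - lam
    for i in range(n - 2, -1, -1):
        r[i] = (a[i] - lam) - b[i]**2 / r[i+1]
    # gamma[k] is the pivot of the twisted factorization where the two
    # one-sided factorizations meet; the smallest one marks the twist.
    gamma = d + r - (a - lam)
    k = int(np.argmin(np.abs(gamma)))
    z = np.empty(n)
    z[k] = 1.0
    for i in range(k - 1, -1, -1):      # back-substitute upward
        z[i] = -b[i] * z[i+1] / d[i]
    for i in range(k, n - 1):           # back-substitute downward
        z[i+1] = -b[i] * z[i] / r[i+1]
    return z / np.linalg.norm(z)

a = np.full(4, 2.0)
b = np.ones(3)
T = np.diag(a) + np.diag(b, 1) + np.diag(b, -1)
lam = np.linalg.eigvalsh(T)[0]          # target eigenvalue
z = twisted_eigvec(a, b, lam)
print(np.linalg.norm(T @ z - lam * z))  # residual near machine epsilon
```

The key point the sketch illustrates is that choosing the twist index by the smallest pivot is what makes one implicit inverse-iteration step sufficient; the block version replaces the scalar divisions with small block solves.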

    Solution of partial differential equations on vector and parallel computers

    The present status of numerical methods for partial differential equations on vector and parallel computers is reviewed. The relevant aspects of these computers are discussed, and a brief review of their development is included, with particular attention paid to those characteristics that influence algorithm selection. Both direct and iterative methods are given for elliptic equations, as well as explicit and implicit methods for initial boundary value problems. The intent is to point out attractive methods, as well as areas where this class of computer architecture cannot be fully utilized because of either hardware restrictions or the lack of adequate algorithms. Application areas utilizing these computers are briefly discussed.

    Explicit alternating direction methods for problems in fluid dynamics

    Recently an iterative method was formulated employing a new splitting strategy for the solution of tridiagonal systems of difference equations. The method was successful in solving the systems of equations arising from one-dimensional initial boundary value problems, and a theoretical analysis proving the convergence of the method for systems whose constituent matrices are positive definite was presented by Evans and Sahimi [22]. The method became known as the Alternating Group Explicit (AGE) method and is referred to as AGE-1D. The explicit nature of the method means that its implementation on parallel machines can be very promising. The method was also extended to solve systems arising from two- and three-dimensional initial boundary value problems, but the AGE-2D and AGE-3D algorithms proved too demanding in computational cost, which largely reduces the advantages of their parallel nature. In this thesis, further theoretical analyses and experimental studies are pursued to establish the convergence and suitability of the AGE-1D method for a wider class of systems arising from univariate and multivariate differential equations with symmetric and non-symmetric difference operators. The possibility of a Chebyshev acceleration of the AGE-1D algorithm is also considered. For two- and three-dimensional problems it is proposed to couple the use of the AGE-1D algorithm with an ADI scheme or an ADI iterative method in what is called the Explicit Alternating Direction (EAD) method. It is then shown through experimental results that the EAD method retains the parallel features of the AGE method and moreover leads to savings of up to 83% in the computational cost of solving some of the model problems. The thesis also includes applications of the AGE-1D algorithm and the EAD method to some problems of fluid dynamics, such as the linearized shallow water equations and the Navier-Stokes equations for the flow in an idealized one-dimensional planetary boundary layer. The thesis concludes with conclusions and suggestions for further work, together with a comprehensive bibliography and an appendix containing some selected programs.
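    A minimal sketch of the AGE splitting for a single tridiagonal system may clarify the idea. The acceleration parameter r and the stand-in dense solves below are illustrative choices, not the thesis's tuned values; the real method inverts each 2 × 2 group explicitly and in parallel:

```python
import numpy as np

def age_solve(A, f, r=1.0, tol=1e-12, max_iter=5000):
    """Sketch of the Alternating Group Explicit (AGE-1D) iteration for a
    tridiagonal system A u = f.  A is split as G1 + G2, where each G is
    block diagonal with 2x2 (or 1x1) groups, so each half-step decouples
    into independent small solves."""
    n = len(f)
    G1 = np.diag(np.diag(A)) / 2.0       # each G takes half the diagonal
    G2 = G1.copy()
    for i in range(n - 1):               # alternate the off-diagonal pairs:
        G = G1 if i % 2 == 0 else G2     # G1 couples (1,2),(3,4),...; G2 the rest
        G[i, i + 1] = A[i, i + 1]
        G[i + 1, i] = A[i + 1, i]
    I = np.eye(n)
    u = np.zeros(n)
    for _ in range(max_iter):
        # A dense solve stands in for the explicit 2x2 group inverses here.
        u = np.linalg.solve(G1 + r * I, f + (r * I - G2) @ u)
        u = np.linalg.solve(G2 + r * I, f + (r * I - G1) @ u)
        if np.linalg.norm(A @ u - f) <= tol * np.linalg.norm(f):
            break
    return u

# Model problem: -u'' discretized as tridiag(-1, 2, -1).
n = 8
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
f = A @ np.ones(n)                       # exact solution is all ones
u = age_solve(A, f)
print(np.linalg.norm(u - np.ones(n)))    # small: iteration has converged
```

Because G1 and G2 are positive semidefinite and share no common null vector, the alternating sweeps contract the error; each half-step is embarrassingly parallel across the 2 × 2 groups, which is the property the thesis exploits.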

    Application of HPC in eddy current electromagnetic problem solution

    As engineering problems are becoming more and more advanced, the size of an average model solved by partial differential equations is rapidly growing and, in order to keep simulation times within reasonable bounds, both faster computers and more efficient software implementations are needed. In the first part of this thesis, the full potential of simulation software has been exploited through high-performance parallel computing techniques. In particular, the simulation of induction heating processes is accomplished within reasonable solution times by implementing different parallel direct solvers for large sparse linear systems in the solution process of a commercial software package. The performance of such solvers on shared-memory systems has been remarkably improved by implementing a multithreaded version of the MUMPS (MUltifrontal Massively Parallel Solver) library, which has been tested on benchmark matrices arising from typical induction heating process simulations. A new multithreading approach and a low-rank approximation technique have been developed by the MUMPS team in Lyon and Toulouse. In the context of a collaboration between the MUMPS team and DII-University of Padova, a preliminary version of these functionalities could be tested on induction heating benchmark problems, and a substantial reduction of the computational cost and memory requirements could be achieved. In the second part of this thesis, some examples of design methodology by virtual prototyping are described. Complex multiphysics simulations involving electromagnetic, circuit, thermal and mechanical problems have been performed by exploiting parallel solvers, as developed in the first part of this thesis. Finally, multiobjective stochastic optimization algorithms have been applied to multiphysics 3D model simulations in search of a set of improved induction heating device configurations.

    A New Method for Efficient Parallel Solution of Large Linear Systems on a SIMD Processor.

    This dissertation proposes a new technique for efficient parallel solution of very large linear systems of equations on a SIMD processor. The model problem used to investigate both the efficiency and applicability of the technique was of a regular structure with semi-bandwidth β, and resulted from approximation of a second-order, two-dimensional elliptic equation on a regular domain under Dirichlet and periodic boundary conditions. With only slight modifications, chiefly to properly account for the mathematical effects of varying bandwidths, the technique can be extended to encompass solution of any regular, banded systems. The computational model used was the MasPar MP-X (model 1208B), a massively parallel processor hostnamed hurricane and housed in the Concurrent Computing Laboratory of the Physics/Astronomy department, Louisiana State University. The maximum bandwidth which caused the problem's size to fit the nyproc × nxproc machine array exactly was determined. This as well as smaller sizes were used in four experiments to evaluate the efficiency of the new technique. Four benchmark algorithms, two direct (Gaussian elimination (GE) and orthogonal factorization) and two iterative (symmetric over-relaxation (SOR) with ω = 2, and the conjugate gradient method (CG)), were used to test the efficiency of the new approach based upon three evaluation metrics: deviations of the computed results, measured as average absolute errors, from the exact solution; the CPU times; and the megaflop rates of the executions. All the benchmarks except GE were implemented in parallel. In all evaluation categories, the new approach outperformed the benchmarks, and very much so when N ≫ p, p being the number of processors and N the problem size. At the maximum system size, the new method was about 2.19 times more accurate and about 1.7 times faster than the benchmarks. But when the system size was a lot smaller than the machine's size, the new approach's performance deteriorated precipitously; in fact, in this circumstance its performance was worse than that of GE, the serial code. Hence, this technique is recommended for solution of linear systems with regular structures on array processors when the problem's size is large in relation to the processor's size.
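    The serial benchmark, banded Gaussian elimination, is what LAPACK's band solvers implement. As a sketch of the same class of model problem, the following solves the 5-point Laplacian on an m × m grid (semi-bandwidth β = m) in diagonal-ordered banded storage; the grid size and Dirichlet-only boundary treatment are illustrative, not the dissertation's:

```python
import numpy as np
from scipy.linalg import solve_banded

m = 8                                # illustrative grid dimension
n = m * m
ab = np.zeros((2 * m + 1, n))        # banded storage: row u+i-j holds a[i, j]
ab[m, :] = 4.0                       # main diagonal
ab[m - 1, 1:] = -1.0                 # superdiagonal (east neighbor)
ab[m + 1, :-1] = -1.0                # subdiagonal (west neighbor)
for k in range(1, m):                # no coupling across grid-row boundaries
    ab[m - 1, k * m] = 0.0
    ab[m + 1, k * m - 1] = 0.0
ab[0, m:] = -1.0                     # outer diagonal at offset +m (north)
ab[2 * m, :-m] = -1.0                # outer diagonal at offset -m (south)

b = np.ones(n)                       # unit source term
u = solve_banded((m, m), ab, b)      # banded LU factor-and-solve
print(u.max())
```

The O(nβ²) cost of this band factorization is the baseline that both the sparse direct solvers of the ice-sheet study and the SIMD technique above are measured against.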

    The NASTRAN theoretical manual

    Designed to accommodate additions and modifications, this commentary on NASTRAN describes the problem-solving capabilities of the program in a narrative fashion and presents developments of the analytical and numerical procedures that underlie the program. Seventeen major sections and numerous subsections cover the organizational aspects of the program, utility matrix routines, static structural analysis, heat transfer, dynamic structural analysis, computer graphics, special structural modeling techniques, error analysis, interaction between structures and fluids, and aeroelastic analysis.