    Structural dynamics branch research and accomplishments for fiscal year 1987

    This publication contains a collection of fiscal year 1987 research highlights from the Structural Dynamics Branch at NASA Lewis Research Center. Highlights from the branch's four major work areas (Aeroelasticity, Vibration Control, Dynamic Systems, and Computational Structural Methods) are included in the report, as well as a complete listing of the FY87 branch publications.

    Parallel numerical methods for analysing optical devices with the BPM

    In this work, some developments in the theory of modelling integrated optical devices are discussed. The theory of the Beam Propagation Method (BPM) for analysing longitudinal optical waveguides is established. The BPM is then formulated and implemented numerically to study both two- and three-dimensional optical waveguides using several Finite-Difference (FD) techniques. For the 2-D analysis, the performance of the implicit Crank-Nicolson (CN), the explicit Real Space (RS) and the Explicit Finite-Difference (EFD) methods is compared through systematic tests on slab waveguide geometries. For three-dimensional applications, two explicit, highly parallel three-dimensional FD-BPMs (the RS and the EFD) have been implemented on two different parallel computers, namely a transputer array (MIMD type) and a Connection Machine (SIMD type). To assess the performance of parallel computers in this context, serial computer codes for the two methods have been implemented and the speeds of the serial and parallel codes have been compared. The parallel FD-BPMs run much faster than the serial implementations; both methods, in their parallel form, can execute a large problem containing 10^6 discretisation points in a few seconds per propagation step. In addition, the performance of the transputer array and the Connection Machine in executing the two FD-BPMs is compared. To assess and compare the two methods, three different rib waveguides and three different directional couplers have been analysed and the results compared with published results. These tests lead to the conclusion that the parallel EFD-BPM is more efficient than the parallel RS-BPM. The linear parallel EFD-BPM was then extended, on both the transputer array and the Connection Machine, to model the nonlinear second-harmonic generation process in three-dimensional waveguides, where the source field is allowed to deplete.

    Parallel algorithms for the solution of elliptic and parabolic problems on transputer networks

    This thesis is a study of the implementation of parallel algorithms for solving elliptic and parabolic partial differential equations on a network of transputers. The thesis commences with a general introduction to parallel processing, which discusses the various ways of introducing parallelism in computer systems and the classification of parallel architectures. In chapter 2, the transputer architecture and the associated language OCCAM are described. The transputer development system (TDS) is also described, together with a short account of other transputer programming languages and a brief description of the methodologies for programming transputer networks. The chapter concludes with a detailed description of the hardware used for the research. [Continues.]

    Structural dynamics branch research and accomplishments to FY 1992

    This publication contains a collection of fiscal year 1992 research highlights from the Structural Dynamics Branch at NASA LeRC. Highlights from the branch's major work areas (Aeroelasticity, Vibration Control, Dynamic Systems, and Computational Structural Methods) are included in the report, as well as a listing of the fiscal year 1992 branch publications.

    Parallel Algorithms for the Solution of the Schrödinger Equation

    Many of the traditional numerical algorithms do not map easily onto the architecture of parallel computers that have emerged recently. For the economic use of these expensive machines and to reduce the total computing time, it is necessary to develop efficient parallel algorithms. The purpose of the thesis is to develop several parallel algorithms for the numerical solution of the Schrödinger equation, which arises in many branches of atomic and molecular physics. Common models of systems of interest may represent stable configurations of two particles (the bound state or eigenvalue problem). Alternatively, one may consider either single-channel or multi-channel scattering. All three mathematical models are investigated in this work. Emphasis is placed on parallel algorithms for MIMD machines. All the algorithms have been implemented and tested on a transputer network, which is a MIMD machine without shared memory. Existing numerical methods, such as those ascribed to Numerov and De Vogelaere, have been investigated and parallel versions of them have been developed. Two exponentially fitted versions of the De Vogelaere algorithm have been developed and are found to be more efficient than the normal De Vogelaere algorithm.
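
    As a point of reference for the Numerov method named in this abstract, here is a minimal serial sketch in Python. It is not the thesis's parallel algorithm. It assumes the equation is written as y''(x) = -g(x) y(x); for a one-dimensional Schrödinger problem, g(x) would be proportional to E - V(x). The function name `numerov` and its parameters are hypothetical.

```python
import math


def numerov(g, x0, h, n_steps, y0, y1):
    """Integrate y''(x) = -g(x) * y(x) with the Numerov three-point recurrence.

    y0 and y1 are the values at x0 and x0 + h. Returns n_steps + 1 values,
    one per grid point x0, x0 + h, ..., x0 + n_steps * h.
    """
    ys = [y0, y1]
    c = h * h / 12.0
    for n in range(1, n_steps):
        x_prev = x0 + (n - 1) * h
        x_cur = x0 + n * h
        x_next = x0 + (n + 1) * h
        # (1 + c g_{n+1}) y_{n+1} = 2 (1 - 5 c g_n) y_n - (1 + c g_{n-1}) y_{n-1}
        numerator = (2.0 * (1.0 - 5.0 * c * g(x_cur)) * ys[n]
                     - (1.0 + c * g(x_prev)) * ys[n - 1])
        ys.append(numerator / (1.0 + c * g(x_next)))
    return ys


# Usage: y'' = -y with y(0) = 0 has the solution sin(x).
values = numerov(lambda x: 1.0, 0.0, 0.01, 100, 0.0, math.sin(0.01))
```

    Each step depends on the two previous values, so the recurrence is inherently sequential along the grid; that dependence is one reason such methods do not map directly onto parallel machines.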

    Lewis Structures Technology, 1988. Volume 1: Structural Dynamics

    The specific purpose of the symposium was to familiarize the engineering structures community with the depth and range of research performed by the Structures Division of the Lewis Research Center and its academic and industrial partners. Sessions covered vibration control, fracture mechanics, ceramic component reliability, parallel computing, nondestructive testing, dynamical systems, fatigue and damage, wind turbines, hot section technology, structural mechanics codes, computational methods for dynamics, structural optimization, and applications of structural dynamics.

    The application of parallel computer technology to the dynamic analysis of suspension bridges

    This research is concerned with the application of distributed computer technology to the solution of non-linear structural dynamic problems, in particular the onset of aerodynamic instabilities in long-span suspension bridge structures, such as flutter, which is a catastrophic aeroelastic phenomenon. The thesis is set out in two distinct parts. Part I presents the theoretical background of the main forms of aerodynamic instability and describes in detail the main solution techniques used to solve the flutter problem. The previously written analysis package ANSUSP, which was developed specifically to predict numerically the onset of flutter instability, is presented. The various solution techniques employed to predict the onset of flutter for the Severn Bridge are discussed. All the results presented in Part I were obtained using a 486DX2 66MHz serial personal computer. Part II examines the main solution techniques in detail and goes on to apply them on a large distributed supercomputer, which allows the problem to be solved considerably faster than is possible using the serial computer system. The solutions presented in Part II are expressed as Performance Indices (PI), which quote the ratio of the time taken to perform a specific calculation using a serial algorithm to the time taken using a parallel algorithm running on the same computer system.

    Parallel Iterative Solution Methods for Linear Systems arising from Discretized PDE's

    In these notes we will present an overview of a number of related iterative methods for the solution of linear systems of equations. These methods are so-called Krylov projection type methods, and they include such popular methods as Conjugate Gradients, Bi-Conjugate Gradients, CGS, Bi-CGSTAB, QMR, LSQR and GMRES. We will show how these methods can be derived from simple basic iteration formulas. We will not give convergence proofs, but will refer for these, as far as available, to the literature. Iterative methods are often used in combination with so-called preconditioning operators (approximations for the inverses of the operator of the system to be solved). Since these preconditioners are not essential in the derivation of the iterative methods, we will not give much attention to them in these notes. However, in most of the actual iteration schemes we have included them in order to facilitate the use of these schemes in actual computations. For the application of the iterative schemes one usually thinks of sparse linear systems, e.g., those arising in finite element or finite difference approximations of (systems of) partial differential equations. However, the structure of the operators plays no explicit role in any of these schemes, and they might also be used successfully to solve certain large dense linear systems. Depending on the situation, that might be attractive in terms of the number of floating point operations. It will turn out that all of the iterative methods are parallelizable in a straightforward manner. However, especially for computers with a memory hierarchy (e.g., cache or vector registers) and for distributed memory computers, the performance can often be improved significantly by rescheduling the operations. We will discuss parallel implementations, and occasionally we will report on experimental findings.
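
    To make the structure of such a scheme concrete, here is a minimal sketch of preconditioned Conjugate Gradients in plain Python. It is an illustration under stated assumptions, not the formulation used in the notes: the matrix is assumed symmetric positive definite and stored densely as a list of rows, and a simple diagonal (Jacobi) preconditioner is chosen for brevity. The function names are hypothetical.

```python
def matvec(A, x):
    """Dense matrix-vector product; A is a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u, v):
    """Inner product; on a distributed machine this is a global reduction."""
    return sum(ui * vi for ui, vi in zip(u, v))


def preconditioned_cg(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A.

    Assumption: the preconditioner is the inverse of the diagonal of A.
    """
    n = len(b)
    inv_diag = [1.0 / A[i][i] for i in range(n)]
    x = [0.0] * n
    r = list(b)                                   # residual b - A x for x = 0
    z = [d * ri for d, ri in zip(inv_diag, r)]    # preconditioned residual
    p = list(z)                                   # first search direction
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5 or 1.0
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 <= tol * b_norm:
            break
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x


# Usage: a 2x2 symmetric positive definite system with solution (1/11, 7/11).
solution = preconditioned_cg([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
```

    The vector updates and the matrix-vector product parallelize naturally, while each inner product requires a global reduction across processors. Those reductions are among the operations whose scheduling tends to matter on distributed memory computers.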

    Singular value computations on the AP1000 array computer

    The increasing popularity of singular value decomposition algorithms, used as a tool in many areas of science and engineering, demands rapid development of fast and reliable implementations. These implementations are no longer bound to the single-processor environment, since more and more parallel computers are available on the market. As a result, software often needs to be re-implemented efficiently on those new parallel architectures. In this thesis we show, using a singular value decomposition algorithm as an example, how this change of working environment can be accomplished with non-trivial gains in performance. We present several optimisation techniques and their impact on algorithm performance at all levels of the parallel memory hierarchy (register, cache, main memory and external processor memory). The central principle in all of the optimisations presented herein is to increase the number of columns (column-segments) held in each level of the memory hierarchy and thereby increase the data reuse factors. The techniques used to optimise for the parallel memory hierarchy are rectangular processor configuration, partitioning, and four-column rotation. In the rectangular processor configuration technique, the data are mapped onto a rectangular network of processors instead of a linear one. This technique improves communication and cache performance such that, on average, execution time was reduced by a factor of 2 and, in the case of long column-segments, by a factor of 5. The partitioning technique involves rearranging the data and the order of computations in the cells. This technique increases the cache hit ratio for large matrices. For relatively modest improvements in cache performance of 2 to 5%, we achieved a significant reduction in execution time of 10 to 20%. The four-column rotation technique improves performance through better register reuse. When a large number of columns is stored per processor, this technique gave a 2 to 10% improvement in execution time over the 'classic' two-column rotation. Apart from the optimisations on the memory hierarchy levels, several floating point optimisations of the algorithm itself are presented, which can be applied on any architecture. The main ideas behind those optimisations are the reduction of the number of floating point instructions executed in a unit of time and the balance of the floating point operations. This was accomplished by reshaping the relevant parts of the code to use the AP1000 processor architecture (SPARC) to its full potential. After combining all of the optimisations, we achieved a sustained 60% reduction in execution time, which corresponds to a 2.5-fold reduction (1/(1 - 0.6) = 2.5). In the cases where long columns of the input matrix were used, we achieved a nearly 5-fold reduction in execution time without adversely affecting the accuracy of the singular values, while maintaining the quadratic convergence of the algorithm. The algorithm was implemented on Fujitsu's AP1000 Array Multiprocessor, but all of the optimisations described can easily be applied to any MIMD architecture with a mesh or hypercube topology, and all but one can also be applied to register-cache uniprocessors. Despite many changes in the structure of the algorithm, we found that convergence was not adversely affected and the accuracy of the orthogonalisation was no worse than for the uniprocessor implementation of the noted SVD algorithm.
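
    The abstract does not name the underlying SVD algorithm. Its references to column rotations and orthogonalisation suggest a one-sided Jacobi (Hestenes) scheme, and that is an assumption here. Under that assumption, the 'classic' two-column rotation can be sketched serially in plain Python as follows. The function name and parameters are hypothetical, and none of the thesis's memory-hierarchy optimisations are represented.

```python
import math


def jacobi_singular_values(rows, tol=1e-12, max_sweeps=30):
    """Singular values by one-sided Jacobi: rotate column pairs until orthogonal.

    rows is a dense matrix given as a list of rows. Assumed scheme, see lead-in.
    """
    m, n = len(rows), len(rows[0])
    # Work on columns, since each rotation combines exactly two columns.
    cols = [[float(rows[r][c]) for r in range(m)] for c in range(n)]
    for _ in range(max_sweeps):
        rotated = False
        for i in range(n - 1):
            for j in range(i + 1, n):
                alpha = sum(v * v for v in cols[i])
                beta = sum(v * v for v in cols[j])
                gamma = sum(u * v for u, v in zip(cols[i], cols[j]))
                if abs(gamma) <= tol * math.sqrt(alpha * beta):
                    continue  # this pair is already orthogonal enough
                rotated = True
                zeta = (beta - alpha) / (2.0 * gamma)
                # Smaller root of t^2 + 2*zeta*t - 1 = 0 for numerical stability.
                t = math.copysign(1.0, zeta) / (abs(zeta) + math.sqrt(1.0 + zeta * zeta))
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = c * t
                ci, cj = cols[i], cols[j]
                cols[i] = [c * u - s * v for u, v in zip(ci, cj)]
                cols[j] = [s * u + c * v for u, v in zip(ci, cj)]
        if not rotated:
            break  # a full sweep without rotations means convergence
    # The singular values are the norms of the orthogonalised columns.
    return sorted((math.sqrt(sum(v * v for v in col)) for col in cols), reverse=True)


# Usage: the singular values of [[1, 1], [0, 1]] are (1 + sqrt(5))/2 and (sqrt(5) - 1)/2.
sigma = jacobi_singular_values([[1.0, 1.0], [0.0, 1.0]])
```

    Each rotation reads and writes only two columns, which is why the number of columns held in registers, cache, or a processor's local memory governs data reuse in this kind of algorithm.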

    The Parallel Glasgow Shell Model Code and Applications

    This thesis describes the further development of the Glasgow Shell Model code, following on from the thesis of Dr. Mohammed Riaz. In that work, a possible parallel development of the Glasgow code was discussed, and a simplified version of the code was constructed which could only run on three processors. Rather than immediately continue in this direction, we felt it would be worthwhile to investigate all of the possible ways that the code could be implemented in parallel, or that the current parallel version of the code could be made faster. Various models of the code were used to arrive at an implementation which is best able to satisfy our expectations for the parallelized version of the program. The development of this code is then described, showing how the code was written to be at once as optimal and as portable as possible, to take account of future architectures that may become available to us, and discussing problems that arise in doing large shell model calculations (in parallel or not). We go on to describe some applications of the new code, specifically to the termination of rotational bands in light sd-shell nuclei, and some original calculations in the cranked shell-model description of the nucleus, with a deformed basis being used. Finally, the alternatives to the present work are described, showing where the present shell model code fits into the panoply of nuclear models for large-basis calculations.