    The impact of fourth generation computers on NASTRAN

    The impact of 'fourth generation' computers (STAR 100 or ILLIAC 4) on NASTRAN is considered. The desired characteristics of large programs designed for execution on 4G machines are described.

    Architecture independent environment for developing engineering software on MIMD computers

    Engineers are constantly faced with solving problems of increasing complexity and detail. Multiple Instruction stream, Multiple Data stream (MIMD) computers have been developed to overcome the performance limitations of serial computers. The hardware architectures of MIMD computers vary considerably and are much more sophisticated than those of serial computers. Developing large-scale software for a variety of MIMD computers is difficult and expensive, so there is a need for tools that facilitate programming these machines. First, the issues that must be considered in developing such tools are examined. The two main areas of concern are architecture independence and data management. Architecture-independent software facilitates portability and improves the longevity and utility of the software product; it provides some insurance for the investment of time and effort that goes into developing the software. The management of data is a crucial aspect of solving large engineering problems and must be considered in light of the new hardware organizations that are available. Second, the functional design and implementation of a software environment that facilitates developing architecture-independent software for large engineering applications are described. The topics of discussion include: a description of the model that supports the development of architecture-independent software; identifying and exploiting concurrency within the application program; data coherence; and engineering database and memory management.
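
    The following is a minimal sketch of the programming model such an environment supports, assuming a hypothetical run_tasks() layer (none of these names come from the system described): the application is written once against the layer, and the serial or parallel backend is selected by configuration rather than in application code.

    # Sketch only: a tiny architecture-independence layer. The backend
    # choice is a configuration detail; application code never changes.
    from concurrent.futures import ThreadPoolExecutor

    def run_tasks(func, inputs, backend="serial", workers=4):
        """Apply func to every input, hiding how the work is scheduled."""
        if backend == "serial":
            return [func(x) for x in inputs]
        with ThreadPoolExecutor(max_workers=workers) as pool:
            return list(pool.map(func, inputs))

    def element_stiffness(e):
        # Stand-in for a per-element engineering computation.
        return e * e

    # Identical application code runs on either "architecture".
    print(run_tasks(element_stiffness, range(8), backend="serial"))
    print(run_tasks(element_stiffness, range(8), backend="parallel"))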

    Algorithms and software for solving finite element equations on serial and parallel architectures

    The primary objective was to compare the performance of state-of-the-art techniques for solving sparse systems with those currently available in the Computational Structural Mechanics (CSM) testbed. One of the first tasks was to become familiar with the structure of the testbed and to install some or all of the SPARSPAK package in it. A brief overview of the CSM Testbed software and its usage is presented, along with an overview of the sparse matrix methods currently employed in the CSM Testbed. An interface, designed and implemented as a research tool for installing and appraising new matrix processors in the CSM Testbed, is described. The results of numerical experiments performed in solving a set of testbed demonstration problems using the processor SPK and other experimental processors are presented.
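
    As an illustration of what such an interface makes possible, here is a hedged sketch in Python (the registry, names, and solver are hypothetical, not the Testbed's actual processor mechanism): solvers implement a common solve(A, b) entry point, so a new matrix processor can be installed and timed without touching the driver.

    import time

    # Hypothetical registry standing in for "installing a processor".
    SOLVERS = {}

    def register(name):
        def wrap(fn):
            SOLVERS[name] = fn
            return fn
        return wrap

    @register("gauss")
    def gauss_solve(A, b):
        """Dense Gaussian elimination with partial pivoting (teaching size)."""
        n = len(b)
        A = [row[:] for row in A]
        b = b[:]
        for k in range(n):
            p = max(range(k, n), key=lambda i: abs(A[i][k]))
            A[k], A[p], b[k], b[p] = A[p], A[k], b[p], b[k]
            for i in range(k + 1, n):
                m = A[i][k] / A[k][k]
                for j in range(k, n):
                    A[i][j] -= m * A[k][j]
                b[i] -= m * b[k]
        x = [0.0] * n
        for i in range(n - 1, -1, -1):
            s = sum(A[i][j] * x[j] for j in range(i + 1, n))
            x[i] = (b[i] - s) / A[i][i]
        return x

    def appraise(name, A, b):
        """Run a registered solver and report its wall-clock time."""
        t0 = time.perf_counter()
        x = SOLVERS[name](A, b)
        return x, time.perf_counter() - t0

    print(appraise("gauss", [[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0]))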

    Sparse matrix methods research using the CSM testbed software system

    Research on sparse matrix techniques for the Computational Structural Mechanics (CSM) Testbed is described. The primary objective was to compare the performance of state-of-the-art techniques for solving sparse systems with those currently available in the CSM Testbed. Thus, one of the first tasks was to become familiar with the structure of the testbed and to install some or all of the SPARSPAK package in it. A suite of subroutines to extract the relevant structural and numerical information about the matrix equations from the database was written, and all the demonstration problems distributed with the testbed were successfully solved. These codes were documented, and performance studies comparing the SPARSPAK technology to the methods currently in the testbed were completed. In addition, some preliminary studies were done comparing some recently developed out-of-core techniques with the performance of the testbed processor INV.
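
    A hedged sketch of that extraction step (the record layout is a hypothetical stand-in for the Testbed database): (row, col, value) records are assembled into the compressed sparse row (CSR) form that a SPARSPAK-style solver consumes.

    def records_to_csr(records, n):
        """records: iterable of (row, col, value), 0-based; n: matrix order."""
        entries = sorted(records)              # order by (row, col)
        indptr, indices, data = [0], [], []
        r_prev = 0
        for r, c, v in entries:
            while r_prev < r:                  # close out finished rows
                indptr.append(len(indices))
                r_prev += 1
            indices.append(c)
            data.append(v)
        while r_prev < n:                      # trailing empty rows
            indptr.append(len(indices))
            r_prev += 1
        return indptr, indices, data

    def csr_matvec(indptr, indices, data, x):
        """y = A @ x touching only the stored nonzeros."""
        return [sum(data[k] * x[indices[k]]
                    for k in range(indptr[i], indptr[i + 1]))
                for i in range(len(indptr) - 1)]

    recs = [(0, 0, 4.0), (0, 1, 1.0), (1, 0, 1.0), (1, 1, 3.0), (2, 2, 2.0)]
    indptr, indices, data = records_to_csr(recs, 3)
    print(indptr, indices, data)                        # [0, 2, 4, 5] ...
    print(csr_matvec(indptr, indices, data, [1.0] * 3)) # [5.0, 4.0, 2.0]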

    A study of sampling, granularity and localities in program restructuring

    Program restructuring is a method to reduce the cost of program execution by improving the locality of the program's reference behavior. Three aspects of program restructuring (sampling, granularity, and localities) are studied in this research. The study of the first aspect, sampling, shows that the high cost of a posteriori restructuring can be reduced considerably by a program restructuring method based on sampled reference strings rather than on the complete reference string. The second aspect is granularity: based on studies of two different block sizes (the basic block and the procedure block), it is found that the performance of restructuring using smaller blocks is not necessarily better. Finally, a new strategy-independent restructuring method, using both the critical and locality principles, is found to be more effective than existing restructuring methods. Results of measurements of paging performance obtained in the experiments are discussed; both fixed-space and variable-space paging policies are considered.
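
    To make the idea concrete, here is a toy sketch of locality-driven restructuring (a simplification; the thesis's critical and locality principles are more elaborate): count how often two blocks are referenced close together in the (sampled) reference string, then pack high-affinity blocks onto the same page.

    from collections import Counter

    def affinity(refs, window=3):
        """Co-reference counts for block pairs appearing within `window`."""
        counts = Counter()
        for i in range(len(refs)):
            for j in range(i + 1, min(i + window, len(refs))):
                if refs[i] != refs[j]:
                    counts[frozenset((refs[i], refs[j]))] += 1
        return counts

    def pack(refs, window=3):
        """Greedy layout: put the strongest pairs on the same page first."""
        pages, placed = [], set()
        for pair, _ in affinity(refs, window).most_common():
            if not pair & placed:
                pages.append(sorted(pair))
                placed |= pair
        for blk in dict.fromkeys(refs):        # leftovers, first-use order
            if blk not in placed:
                pages.append([blk])
                placed.add(blk)
        return pages

    # A and C alternate tightly, as do B and D, so they share pages.
    print(pack(list("ACACBDBDACAC")))          # [['A', 'C'], ['B', 'D']]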

    Multilayered Heterogeneous Parallelism Applied to Atmospheric Constituent Transport Simulation

    Heterogeneous multicore chipsets with many levels of parallelism are becoming increasingly common in high-performance computing systems. Effective use of parallelism in these new chipsets constitutes the challenge facing a new generation of large-scale scientific computing applications. This study examines methods for improving the performance of two-dimensional and three-dimensional atmospheric constituent transport simulation on the Cell Broadband Engine Architecture (CBEA). A function-offloading approach is used in a 2D transport module, and a vector stream processing approach is used in a 3D transport module. Two methods for transferring noncontiguous data between main memory and accelerator local storage are compared. By leveraging the heterogeneous parallelism of the CBEA, the 3D transport module achieves performance comparable to two nodes of an IBM BlueGene/P, or eight Intel Xeon cores, on a single PowerXCell 8i chip. Module performance on two CBEA systems, an IBM BlueGene/P, and an eight-core shared-memory Intel Xeon workstation is reported.
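
    The trade-off behind those two transfer methods can be sketched with a toy cost model (the constants below are illustrative assumptions, not measurements from the study): issuing one transfer per noncontiguous piece pays the startup latency repeatedly, while gathering the pieces into a contiguous buffer pays extra copies but only one startup.

    def per_piece_cost(pieces, latency=1.0, per_byte=0.01):
        """One transfer per piece: the startup latency is paid every time."""
        return sum(latency + per_byte * len(p) for p in pieces)

    def gather_cost(pieces, latency=1.0, per_byte=0.01, copy_per_byte=0.002):
        """Gather into one buffer (extra copying), then one bulk transfer."""
        total = sum(len(p) for p in pieces)
        return copy_per_byte * total + latency + per_byte * total

    # Many short noncontiguous runs are the worst case for per-piece
    # transfers, because latency dominates the cost.
    pieces = [bytes(64)] * 256                 # 256 noncontiguous 64-byte runs
    print(per_piece_cost(pieces))              # 256 startups: ~419.84
    print(gather_cost(pieces))                 # 1 startup:    ~197.61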

    Efficacy of B-Trees in an Information Storage and Retrieval Environment

    This study investigates the efficacy of B-trees in an information storage and retrieval environment. A practical information storage and retrieval system is developed and used to test the performance of B-trees.
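
    For reference, a minimal B-tree lookup in Python (a hypothetical in-memory simplification of the disk-based structure being measured): each node stores sorted keys and, in interior nodes, one more child than keys, so a search touches O(log n) nodes.

    from bisect import bisect_left

    class Node:
        def __init__(self, keys, children=None):
            self.keys = keys                  # sorted keys in this node
            self.children = children or []    # empty for leaf nodes

    def search(node, key):
        """Return True if key is stored in the subtree rooted at node."""
        i = bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            return True
        if not node.children:                 # reached a leaf: not found
            return False
        return search(node.children[i], key)  # descend into the i-th child

    # A small order-3 tree holding the keys 1..7.
    tree = Node([4], [Node([2], [Node([1]), Node([3])]),
                      Node([6], [Node([5]), Node([7])])])
    print(search(tree, 5), search(tree, 8))   # True False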

    Synthesis heuristics for large asynchronous sequential circuits

    Many well-known synthesis procedures for asynchronous sequential circuits produce minimal or near-minimal results, but are practical only for very small problems. These algorithms become unwieldy when applied to large circuits with, for example, three or more input variables and twenty or more internal states. New heuristic procedures are described which permit the synthesis of very large machines. Although the resulting designs are generally not minimal, the heuristics are able to produce near-minimal solutions orders of magnitude more rapidly than the minimal algorithms. A method for specifying sequential circuit behavior is presented: input-output sequences define submachines, or modules, which, when properly interconnected, form the required sequential circuit. It is shown that the waveform and interconnection specifications may easily be translated into flow table form. A large flow table simplification heuristic is developed; the algorithm may be applied to tables having hundreds of rows, and handles both normal and non-normal mode circuit specifications. Nonstandard state assignment procedures for normal, fundamental-mode asynchronous sequential circuits are examined, and an algorithm for rapidly generating internal state assignments for large flow tables is proposed. The algorithms described have been programmed in PL/1 and incorporated into an automated design system for asynchronous circuits; the system also includes minimum and near-minimum variable state assignment generators, a code evaluation routine, a design equation generator, and two Boolean equation simplification procedures. Large sequential circuits designed using the system illustrate the utility of the heuristic procedures.
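
    A toy version of the row-merging flavor of such a heuristic (hedged: real flow table compatibility must also track implied next-state pairs, which this sketch omits): two rows may be merged when their specified entries agree in every input column, with "-" marking a don't-care.

    def rows_mergeable(r1, r2):
        """Compatible iff no column holds two different specified entries."""
        return all(a == b or a == "-" or b == "-" for a, b in zip(r1, r2))

    def merge_rows(table):
        """Greedily fold each row into the first compatible merged row."""
        merged = []
        for row in table:
            for m in merged:
                if rows_mergeable(row, m):
                    # Fill the merged row's don't-cares from this row.
                    m[:] = [a if a != "-" else b for a, b in zip(m, row)]
                    break
            else:
                merged.append(list(row))
        return merged

    # Entries name the next stable state per input column; '-' = unspecified.
    flow_table = [
        ["A", "B", "-"],
        ["A", "-", "C"],    # agrees with row 0 wherever both are specified
        ["D", "B", "C"],    # clashes with the merged row in column 0
    ]
    print(merge_rows(flow_table))   # [['A', 'B', 'C'], ['D', 'B', 'C']]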

    Density-Aware Linear Algebra in a Column-Oriented In-Memory Database System

    Linear algebra operations appear in nearly every application in advanced analytics, machine learning, and various science domains. To date, many data analysts and scientists have tended to use statistics software packages or hand-crafted solutions for their analyses. In the era of the data deluge, however, external statistics packages and custom analysis programs, which often run on single workstations, are incapable of keeping up with the vast increase in data volume. In particular, there is increasing demand from scientists for large-scale data manipulation, orchestration, and advanced data management capabilities, which are among the key features of a mature relational database management system (DBMS). With the rise of main-memory database systems, it has now become feasible to also consider applications that build on linear algebra. This thesis presents a deep integration of linear algebra functionality into an in-memory column-oriented database system. In particular, this work shows that it has become feasible to execute linear algebra queries on large data sets directly in a DBMS-integrated engine (LAPEG), without the need to transfer data or be restricted by hard disk latencies. From the various application examples cited in this work, we deduce a number of requirements that are relevant for a database system that includes linear algebra functionality. Besides the deep integration of matrices and numerical algorithms, these include optimization of expressions, transparent matrix handling, scalability and data parallelism, and data manipulation capabilities. These requirements are addressed by our linear algebra engine. In particular, the core contributions of this thesis are as follows. First, we show that the columnar storage layer of an in-memory DBMS permits an easy adoption of efficient sparse matrix data types and algorithms. Furthermore, we show that the execution of linear algebra expressions significantly benefits from several techniques inspired by database technology. In a novel way, we implemented several of these optimization strategies in LAPEG's optimizer (SpMachO), which uses an advanced density estimation method (SpProdest) to predict the matrix density of intermediate results. Moreover, we present an adaptive matrix data type, AT Matrix, to obviate the need for scientists to select appropriate matrix representations. The tiled substructure of AT Matrix is exploited by our matrix multiplication to saturate the different sockets of a multicore main-memory platform, reaching a speed-up of up to 6x compared to alternative approaches. Finally, a major part of this thesis is devoted to the topic of data manipulation: we propose a matrix manipulation API and present different mutable matrix types to enable fast inserts and deletes. We conclude that our linear algebra engine is well suited to processing dynamic, large matrix workloads in an optimized way. In particular, the DBMS-integrated LAPEG fills the linear algebra gap and makes the columnar in-memory DBMS attractive as an efficient, scalable ad-hoc analysis platform for scientists.
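
    A classical baseline for the density prediction problem that SpMachO addresses (SpProdest itself is more sophisticated; this independence estimate is a textbook assumption, not the thesis's method): if the nonzeros of A (m x k, density dA) and B (k x n, density dB) are placed independently and uniformly, each output entry of A @ B is nonzero with probability 1 - (1 - dA*dB)**k, ignoring numerical cancellation.

    import random

    def predicted_density(dA, dB, k):
        """Expected density of A @ B under the independence assumption."""
        return 1.0 - (1.0 - dA * dB) ** k

    def measured_density(m, k, n, dA, dB, seed=0):
        """Empirical check using random Boolean sparse matrices."""
        rng = random.Random(seed)
        A = [[rng.random() < dA for _ in range(k)] for _ in range(m)]
        B = [[rng.random() < dB for _ in range(n)] for _ in range(k)]
        nnz = sum(any(A[i][t] and B[t][j] for t in range(k))
                  for i in range(m) for j in range(n))
        return nnz / (m * n)

    print(predicted_density(0.05, 0.05, 200))        # ~0.394
    print(measured_density(100, 200, 100, 0.05, 0.05))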