51 research outputs found

    A design methodology for portable software on parallel computers

    Get PDF
    This final report for research that was supported by grant number NAG-1-995 documents our progress in addressing two difficulties in parallel programming. The first difficulty is developing software that will execute quickly on a parallel computer. The second difficulty is transporting software between dissimilar parallel computers. In general, we expect that more hardware-specific information will be included in software designs for parallel computers than in designs for sequential computers. This inclusion is an instance of portability being sacrificed for high performance. New parallel computers are being introduced frequently. Trying to keep one's software on the current high performance hardware, a software developer almost continually faces yet another expensive software transportation. The problem of the proposed research is to create a design methodology that helps designers to more precisely control both portability and hardware-specific programming details. The proposed research emphasizes programming for scientific applications. We completed our study of the parallelizability of a subsystem of the NASA Earth Radiation Budget Experiment (ERBE) data processing system. This work is summarized in section two. A more detailed description is provided in Appendix A ('Programming Practices to Support Eventual Parallelism'). Mr. Chrisman, a graduate student, wrote and successfully defended a Ph.D. dissertation proposal which describes our research associated with the issues of software portability and high performance. The list of research tasks are specified in the proposal. The proposal 'A Design Methodology for Portable Software on Parallel Computers' is summarized in section three and is provided in its entirety in Appendix B. We are currently studying a proposed subsystem of the NASA Clouds and the Earth's Radiant Energy System (CERES) data processing system. This software is the proof-of-concept for the Ph.D. dissertation. We have implemented and measured the performance of a portion of this subsystem on the Intel iPSC/2 parallel computer. These results are provided in section four. Our future work is summarized in section five, our acknowledgements are stated in section six, and references for published papers associated with NAG-1-995 are provided in section seven

    Parallel processing and expert systems

    Get PDF
    Whether it be monitoring the thermal subsystem of Space Station Freedom, or controlling the navigation of the autonomous rover on Mars, NASA missions in the 1990s cannot enjoy an increased level of autonomy without the efficient implementation of expert systems. Merely increasing the computational speed of uniprocessors may not be able to guarantee that real-time demands are met for larger systems. Speedup via parallel processing must be pursued alongside the optimization of sequential implementations. Prototypes of parallel expert systems have been built at universities and industrial laboratories in the U.S. and Japan. The state-of-the-art research in progress related to parallel execution of expert systems is surveyed. The survey discusses multiprocessors for expert systems, parallel languages for symbolic computations, and mapping expert systems to multiprocessors. Results to date indicate that the parallelism achieved for these systems is small. The main reasons are (1) the body of knowledge applicable in any given situation and the amount of computation executed by each rule firing are small, (2) dividing the problem solving process into relatively independent partitions is difficult, and (3) implementation decisions that enable expert systems to be incrementally refined hamper compile-time optimization. In order to obtain greater speedups, data parallelism and application parallelism must be exploited

    Application of Pfortran and Co-Array Fortran in the parallelization of the GROMOS96 molecular dynamics module

    Get PDF
    After at least a decade of parallel tool development, parallelization of scientific applications remains a significant undertaking. Typically parallelization is a specialized activity supported only partially by the programming tool set, with the programmer involved with parallel issues in addition to sequential ones. The details of concern range from algorithm design down to low-level data movement details. The aim of parallel programming tools is to automate the latter without sacrificing performance and portability, allowing the programmer to focus on algorithm specification and development. We present our use of two similar parallelization tools, Pfortran and Cray's Co-Array Fortran, in the parallelization of the GROMOS96 molecular dynamics module. Our parallelization started from the GROMOS96 distribution's sharedmemory implementation of the replicated algorithm, but used little of that existing parallel structure. Consequently, our parallelization was close to starting with the sequential version. We found the intuitive extensions to Pfortran and Co-Array Fortran helpful in the rapid parallelization of the project. We present performance figures for both the Pfortran and CoArray Fortran parallelizations showing linear speedup within the range expected by these parallelization methods

    Parallel machine architecture and compiler design facilities

    Get PDF
    The objective is to provide an integrated simulation environment for studying and evaluating various issues in designing parallel systems, including machine architectures, parallelizing compiler techniques, and parallel algorithms. The status of Delta project (which objective is to provide a facility to allow rapid prototyping of parallelized compilers that can target toward different machine architectures) is summarized. Included are the surveys of the program manipulation tools developed, the environmental software supporting Delta, and the compiler research projects in which Delta has played a role

    Limits to parallelism in scientific computing

    Get PDF
    The goal of our research is to decrease the execution time of scientific computing applications. We exploit the application\u27s inherent parallelism to achieve this goal. This exploitation is expensive as we analyze sequential applications and port them to parallel computers. Many scientifically computational problems appear to have considerable exploitable parallelism; however, upon implementing a parallel solution on a parallel computer, limits to the parallelism are encountered. Unfortunately, many of these limits are characteristic of a specific parallel computer. This thesis explores these limits.;We study the feasibility of exploiting the inherent parallelism of four NASA scientific computing applications. We use simple models to predict each application\u27s degree of parallelism at several levels of granularity. From this analysis, we conclude that it is infeasible to exploit the inherent parallelism of two of the four applications. The interprocessor communication of one application is too expensive relative to its computation cost. The input and output costs of the other application are too expensive relative to its computation cost. We exploit the parallelism of the remaining two applications and measure their performance on an Intel iPSC/2 parallel computer. We parallelize an Optimal Control Boundary Value Problem. This guidance control problem determines an optimal trajectory of a boat in a river. We parallelize the Carbon Dioxide Slicing technique which is a macrophysical cloud property retrieval algorithm. This technique computes the height at the top of a cloud using cloud imager measurements. We consider the feasibility of exploiting its massive parallelism on a MasPar MP-2 parallel computer. We conclude that many limits to parallelism are surmountable while other limits are inescapable.;From these limits, we elucidate some fundamental issues that must be considered when porting similar problems to yet-to-be designed computers. We conclude that the technological improvements to reduce the isolation of computational units frees a programmer from many of the programmer\u27s current concerns about the granularity of the work. We also conclude that the technological improvements to relax the regimented guidance of the computational units allows a programmer to exploit the inherent heterogeneous parallelism of many applications

    Parallel LISP

    Get PDF
    Projects in the past few years have looked into the problem of automatic parallelization of the Lisp programming language. Since it appears to be feasible to adapt Lisp to run on a general parallel computer, an implementation will be developed. This implementation will be as general as possible in order to locate the tradeoffs between implementing Lisp on a general parallel computer versus having an efficient interpreter. This implementation can be used to study the execution characteristics of Lisp in a parallel environment. It can also be used to derive information about architectural features which affect the performance of Lisp on parallel machines. This implementation will use a multitasking system and interprocess communication to simulate an MIMD machine. The implementation will include the formation, queuing, distribution, and execution of dataflow frames. Realistic Lisp application programs will be used with the implementation to examine the feasibility and efficiency of parallel Lisp. Measurements derivable from the simulator include number of processor cycles, processor utilization, memory requirements, and speedup. These tests will provide two main results. First, they will indicate possibilities for further gains by illustrating the bottlenecks in such a scheme. Second, they will help determine if it is indeed feasible to run Lisp on a parallel machine or if instead the overhead is too high for the application to be profitable. Most likely, some parallelism will be profitable. The simulation will provide information on the extent to which parallelism can be utilized

    Language Constructs for Data Partitioning and Distribution

    Get PDF

    High-Performance Computing and Four-Dimensional Data Assimilation: The Impact on Future and Current Problems

    Get PDF
    This is the final technical report for the project entitled: "High-Performance Computing and Four-Dimensional Data Assimilation: The Impact on Future and Current Problems", funded at NPAC by the DAO at NASA/GSFC. First, the motivation for the project is given in the introductory section, followed by the executive summary of major accomplishments and the list of project-related publications. Detailed analysis and description of research results is given in subsequent chapters and in the Appendix
    • …
    corecore