
    Turbomachinery CFD on parallel computers

    The role of multistage turbomachinery simulation in the development of propulsion system models is discussed. In particular, the need for simulations with higher fidelity and faster turnaround time is highlighted, and it is shown how such fast simulations can be used in engineering-oriented environments. The use of parallel processing to achieve the required turnaround times is discussed, and current work by several researchers in this area is summarized. Parallel turbomachinery CFD research at the NASA Lewis Research Center is then highlighted. These efforts focus on implementing the average-passage turbomachinery model on MIMD, distributed-memory parallel computers. Performance results are given for inviscid, single-blade-row and viscous, multistage applications on several parallel computers, including networked workstations.
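
    As a rough illustration of the distributed-memory, MIMD style of solver described above, the C sketch below partitions a one-dimensional flow field across MPI ranks and exchanges halo cells each step. The array size, the trivial cell update, and the 1-D partitioning are assumptions made for illustration only; they are not the average-passage solver discussed in the paper.

    /* Illustrative sketch: 1-D domain decomposition of a flow field across
     * MPI ranks with halo exchange, in the spirit of distributed-memory
     * MIMD turbomachinery solvers.  The cell count and the trivial
     * "update" are hypothetical, not taken from the paper. */
    #include <mpi.h>
    #include <stdlib.h>

    #define LOCAL_CELLS 1024   /* cells owned by each rank (assumption) */

    int main(int argc, char **argv)
    {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* local slab plus one halo cell on each side */
        double *q = calloc(LOCAL_CELLS + 2, sizeof(double));
        int left  = (rank == 0)        ? MPI_PROC_NULL : rank - 1;
        int right = (rank == size - 1) ? MPI_PROC_NULL : rank + 1;

        for (int step = 0; step < 100; ++step) {
            /* exchange halo cells with neighbouring partitions */
            MPI_Sendrecv(&q[1], 1, MPI_DOUBLE, left, 0,
                         &q[LOCAL_CELLS + 1], 1, MPI_DOUBLE, right, 0,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Sendrecv(&q[LOCAL_CELLS], 1, MPI_DOUBLE, right, 1,
                         &q[0], 1, MPI_DOUBLE, left, 1,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);

            /* stand-in for the per-cell flux/update work of a real solver */
            for (int i = 1; i <= LOCAL_CELLS; ++i)
                q[i] = 0.5 * (q[i - 1] + q[i + 1]);
        }

        free(q);
        MPI_Finalize();
        return 0;
    }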

    The "MIND" Scalable PIM Architecture

    MIND (Memory, Intelligence, and Network Device) is an advanced parallel computer architecture for high performance computing and scalable embedded processing. It is a Processor-in-Memory (PIM) architecture integrating both DRAM bit cells and CMOS logic devices on the same silicon die. MIND is multicore, with multiple memory/processor nodes on each chip, and supports global shared memory across systems of MIND components. MIND is distinguished from other PIM architectures in that it incorporates mechanisms for efficient support of a global parallel execution model based on the semantics of message-driven multithreaded split-transaction processing. MIND is designed to operate either in conjunction with other conventional microprocessors or in standalone arrays of like devices. It also incorporates mechanisms for fault tolerance, real-time execution, and active power management. This paper describes the major elements and operational methods of the MIND architecture.
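
    The following C sketch illustrates the message-driven execution semantics the abstract refers to: work arrives as "parcels" that name a handler and carry its operands, and a node drains its queue of parcels rather than blocking on round-trip requests. The parcel structure, queue, and handler here are hypothetical illustrations of the model only, not the MIND hardware interface.

    /* Conceptual sketch of message-driven execution: each incoming parcel
     * names a handler and carries its arguments, and a node runs whatever
     * parcels arrive instead of waiting on remote replies.  Types and the
     * queue are invented for illustration. */
    #include <stdio.h>

    typedef struct parcel {
        void (*handler)(void *payload);  /* action to run at the destination node */
        void *payload;                   /* operands carried with the message */
        struct parcel *next;
    } parcel_t;

    static parcel_t *queue_head = NULL;

    static void enqueue(parcel_t *p) { p->next = queue_head; queue_head = p; }

    /* Example handler: a memory update performed where the data lives. */
    static void remote_increment(void *payload)
    {
        long *cell = payload;
        ++*cell;
    }

    int main(void)
    {
        long memory_cell = 41;
        parcel_t p = { remote_increment, &memory_cell, NULL };
        enqueue(&p);

        /* Node main loop: drain parcels, invoking each handler as a short
         * thread of work; no caller waits for a round-trip reply. */
        while (queue_head) {
            parcel_t *cur = queue_head;
            queue_head = cur->next;
            cur->handler(cur->payload);
        }

        printf("cell = %ld\n", memory_cell);  /* prints 42 */
        return 0;
    }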

    NCESPARC+: an implementation of a SPARC architecture with hardware support to multithreading for the Multiplus multiprocessor

    NCESPARC+ is an implementation of the SPARC v8 architecture with hardware support for a variable number of thread contexts, which is under development for use within the framework of the Multiplus distributed shared-memory multiprocessor. It is expected to provide an efficient and automatic mechanism to hide the latency of busy-waiting synchronization loops, cache-coherence protocol operations, and remote memory accesses within the Multiplus multiprocessor. NCESPARC+ performs context switching in at most four processor cycles whenever there is an instruction cache miss, a data dependency on the destination operand of a pending load instruction, or a busy-waiting synchronization loop. It has a decoupled architecture which allows the main pipeline to process instructions from a given context while the Memory Interface Unit performs memory access operations related to that same context or to any other context. Results of simulation experiments show the impact of some architectural parameters on NCESPARC+ processor performance and demonstrate that the use of multiple thread contexts can effectively produce a much better utilization of the processor when long-latency operations are performed. In addition, NCESPARC+ processor performance with a single context is superior to that of a standard implementation of the SPARC architecture due to its decoupled architecture.
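
    A toy software model of the latency-hiding mechanism described above is sketched below in C: several thread contexts are kept, and execution switches to another ready context whenever the current one stalls on a simulated long-latency event. The context structure, stall probability, and cycle counts are invented for illustration and do not reflect the actual NCESPARC+ design.

    /* Toy model of multithreaded latency hiding: switch to another ready
     * context when the current one stalls on a long-latency event
     * (cache miss, pending load, busy-wait).  All parameters here are
     * hypothetical. */
    #include <stdio.h>
    #include <stdlib.h>

    #define NCTX 4
    #define MISS_PENALTY 20   /* cycles a stalled context stays unavailable */

    typedef struct {
        int stall_until;      /* cycle at which this context is ready again */
        long work_done;       /* instructions retired by this context */
    } context_t;

    int main(void)
    {
        context_t ctx[NCTX] = {{0, 0}};
        int cur = 0;
        long busy = 0, idle = 0;

        for (int cycle = 0; cycle < 100000; ++cycle) {
            /* find a ready context, starting from the current one */
            int found = -1;
            for (int i = 0; i < NCTX; ++i) {
                int c = (cur + i) % NCTX;
                if (ctx[c].stall_until <= cycle) { found = c; break; }
            }
            if (found < 0) { ++idle; continue; }   /* all contexts stalled */

            cur = found;
            ctx[cur].work_done++;
            ++busy;
            if (rand() % 100 < 5)      /* ~5% chance of a long-latency event */
                ctx[cur].stall_until = cycle + MISS_PENALTY;
        }

        printf("utilization with %d contexts: %.1f%%\n",
               NCTX, 100.0 * busy / (busy + idle));
        return 0;
    }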

    Shared versus distributed memory multiprocessors

    The question of whether multiprocessors should have shared or distributed memory has attracted a great deal of attention. Some researchers argue strongly for building distributed-memory machines, while others argue just as strongly for programming shared-memory multiprocessors. A great deal of research is underway on both types of parallel systems. Special emphasis is placed on systems with a very large number of processors for computation-intensive tasks, and research and implementation trends are considered. It appears that the two types of systems will likely converge to a common form for large-scale multiprocessors.
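
    For contrast with explicit message passing (as in the MPI sketch earlier), the short C program below shows the shared-memory programming model at its simplest: threads coordinate through a lock and communicate by updating the same variable. It is a generic illustration of the model, not code from the cited work.

    /* Minimal shared-memory programming model: threads communicate by
     * reading and writing the same variables, with locks for coordination,
     * rather than by sending explicit messages between private memories. */
    #include <pthread.h>
    #include <stdio.h>

    #define NTHREADS 4

    static long counter = 0;                         /* shared state */
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    static void *worker(void *arg)
    {
        (void)arg;
        for (int i = 0; i < 100000; ++i) {
            pthread_mutex_lock(&lock);               /* implicit communication */
            ++counter;                               /*   via shared memory    */
            pthread_mutex_unlock(&lock);
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t t[NTHREADS];
        for (int i = 0; i < NTHREADS; ++i)
            pthread_create(&t[i], NULL, worker, NULL);
        for (int i = 0; i < NTHREADS; ++i)
            pthread_join(t[i], NULL);
        printf("counter = %ld\n", counter);          /* NTHREADS * 100000 */
        return 0;
    }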