19 research outputs found

    KfK-SUPRENUM-Seminar 19.-20.10.1989. Tagungsbericht [KfK SUPRENUM Seminar, 19-20 October 1989: conference report]


    Towards the Teraflop CFD

    We survey current projects in the area of parallel supercomputers. The machines considered here will become commercially available in the 1990-1992 time frame. All are suitable for exploring the critical issues in applying parallel processors to large-scale scientific computations, in particular CFD calculations. This chapter presents an overview of the surveyed machines and a detailed analysis of the various architectural and technology approaches taken. Particular emphasis is placed on the feasibility of a Teraflops capability following the paths proposed by the various developers.

    A bibliography on parallel and vector numerical algorithms

    This is a bibliography of numerical methods. It also includes a number of other references on machine architecture, programming languages, and other topics of interest in scientific computing. Certain conference proceedings and anthologies that have been published in book form are also listed.

    A shared memory multi-microprocessor system with hardware supported message passing mechanisms.

    by Lam Chin Hung. Thesis (M.Phil.)--Chinese University of Hong Kong, 1990. Bibliography: leaves 167-174.
    Contents: Abstract; Acknowledgements; Table of Contents.
    Chapter 1, Introduction: 1.1 Gaining performance with multiprocessing; 1.1.1 Software approach; 1.1.2 Hardware approach; 1.2 Parallel processing; 1.3 Gaining performance with multiprocessing; 1.3.1 Multiprocessor configurations; 1.3.2 Multiprocessor design issues; 1.3.3 Using microprocessors; 1.3.4 Bus-based systems; 1.4 Shared memory and message passing; 1.4.1 Shared memory; 1.4.2 Message passing; 1.4.3 Comparisons of the two paradigms; 1.5 Summary and comment.
    Chapter 2, An Overview of Common Approaches: 2.1 SUPRENUM; 2.2 MEMSY; 2.3 ELXSI; 2.4 Sequent; 2.5 YACKOS; 2.6 Summary.
    Chapter 3, The MPC Approach: 3.1 A shared memory multiprocessor architecture; 3.2 Message passer for inter-process communication; 3.2.1 A review of the message passer approach; 3.2.2 Pitfalls of the message passer approach; 3.3 The role of the MPC; 3.3.1 The quest for the MPC; 3.3.2 Duties of the MPC; 3.3.2.1 Software aspects; 3.3.2.2 Hardware aspects; 3.4 Advantages and disadvantages; 3.4.1 Advantages; 3.4.2 Disadvantages; 3.4.3 Other discussions; 3.5 Summary.
    Chapter 4, The Design of SM3: 4.1 Introduction to SM3; 4.2 Software aspects; 4.2.1 Programming model; 4.2.1.1 Logical entities; 4.2.1.2 Communication procedure; 4.2.2 Message structure; 4.2.2.1 Broadcast versus point-to-point messages; 4.2.2.2 Message priority; 4.2.2.3 Blocking versus non-blocking; 4.3 Hardware aspects; 4.3.1 Overall architecture; 4.3.2 The host machine; 4.3.3 Slave processor nodes; 4.3.4 The MPC; 4.4 Communication protocols; 4.4.1 Short and long messages; 4.4.2 Point-to-point messages; 4.4.3 1-to-N DMA for broadcast messages; 4.4.3.1 Introducing 1-to-N DMA; 4.4.3.2 1-to-N DMA operation; 4.4.3.3 Merits and demerits of 1-to-N DMA; 4.5 Summary.
    Chapter 5, Implementation Issues of SM3: 5.1 The shared bus - VMEbus; 5.1.1 Why VMEbus; 5.1.2 Customizing the VMEbus; 5.2 The host machine; 5.3 Slave processor nodes; 5.3.1 Overview of a PN; 5.3.2 The MC68030 microprocessor; 5.3.3 The DMAC M68442; 5.3.4 Registers; 5.3.5 Shared-bus interface; 5.3.6 Communication logic; 5.4 The MPC; 5.4.1 Overview of the MPC; 5.4.2 Registers; 5.4.3 Communication logic; 5.5 Protocol implementation; 5.5.1 Point-to-point messages; 5.5.2 Broadcast messages; 5.5.2.1 Circular buffer queue; 5.5.2.2 Participating entities; 5.5.2.3 Protocol details; 5.6 System start-up procedure; 5.6.1 Power-up reset of PNs; 5.6.2 Initialization of the processor pool; 5.7 Summary.
    Chapter 6, Application Examples: 6.1 Introduction; 6.2 Matrix multiplication; 6.3 Parallel quicksort; 6.4 Pipeline problems.
    Chapter 7, Unsolved Problems and Future Development: 7.1 Current status; 7.2 Possible immediate enhancements; 7.2.1 Enhancement to the PNs; 7.2.2 Enhancement of the MPC; 7.2.3 Communication kernel enhancement; 7.3 Limitation of a shared bus; 7.4 Number crunching capability; 7.5 Parallel programming environment; 7.5.1 Conform to serial language; 7.5.2 Moving to parallel programming languages; 7.5.2.1 Uni-processor Unix; 7.5.2.2 Porting Unix; 7.5.2.3 Multiprocessor Unix; 7.5.3 Object-oriented approach; 7.6 Summary.
    Chapter 8, Conclusion: 8.1 Thesis summary; 8.2 Author's comment; 8.3 Looking into the future.
    Appendices: A Block diagram; B Circuit diagrams; C PCB layout; D VMEbus address map; E Processor node address map; F Register layout (F.1 Registers on a PN; F.2 Registers on the MPC); G PAL design; H Communication sub-bus (H.1 Signal definition; H.2 Pin assignment); I Feasibility of task distribution plan; J Communication primitives; K Photographs of SM3; L Protocol state diagrams (L.1 Predefined partial state diagrams; L.2 Point-to-point messages; L.3 Broadcast messages); M Boot-up procedure of SM3.
    Publications; References.
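    The contents above mention a circular buffer queue used for broadcast messages (section 5.5.2.1). As a rough sketch of that general idea only — the class name, sizes, and policy below are my own assumptions, not details from the thesis — a broadcast slot can be recycled only after every reader node has consumed it:

    ```python
    class BroadcastQueue:
        """Fixed-size circular buffer broadcasting each message to N readers."""

        def __init__(self, capacity, n_readers):
            self.buf = [None] * capacity
            self.capacity = capacity
            self.head = 0                    # next slot the writer fills
            self.tails = [0] * n_readers     # next slot each reader consumes
            self.counts = [0] * n_readers    # unread messages per reader

        def post(self, msg):
            """Writer side: refuse if the slowest reader still needs the oldest slot."""
            if max(self.counts) == self.capacity:
                return False                 # queue full for at least one reader
            self.buf[self.head] = msg
            self.head = (self.head + 1) % self.capacity
            self.counts = [c + 1 for c in self.counts]
            return True

        def take(self, reader):
            """Reader side: return this reader's next unread broadcast, or None."""
            if self.counts[reader] == 0:
                return None
            msg = self.buf[self.tails[reader]]
            self.tails[reader] = (self.tails[reader] + 1) % self.capacity
            self.counts[reader] -= 1
            return msg
    ```

    The point of tracking a separate tail per reader is that a broadcast is not "done" when one node has seen it; the writer stalls on the slowest consumer, which is the same back-pressure problem a hardware circular queue must solve.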

    Literature Review For Networking And Communication Technology

    This report documents the results of a literature search performed in the area of networking and communication technology.

    State-of-the-art Assessment For Simulated Forces

    A summary of the review of the state of the art in simulated forces, conducted to support the research objectives of Research and Development for Intelligent Simulated Forces.

    Compiling Programs for Nonshared Memory Machines

    Nonshared-memory parallel computers promise scalable performance for scientific computing needs. Unfortunately, these machines are currently difficult to program because the message-passing languages available for them do not reflect the computational models used in designing algorithms. This introduces a semantic gap in the programming process that is difficult for the programmer to bridge. The purpose of this research is to show how nonshared-memory machines can be programmed at a higher level than is currently possible. We do this by developing techniques for compiling shared-memory programs for execution on those architectures. The heart of the compilation process is translating references to shared memory into explicit messages between processors. To do this, we first define a formal model for distributing data structures across processor memories. Several abstract results describing the messages needed to execute a program are derived immediately from this formalism. We then develop two distinct forms of analysis to translate these formulas into actual programs. Compile-time analysis is used when enough information is available to the compiler to completely characterize the data sent in the messages. This allows excellent code to be generated for a program. Run-time analysis produces code to examine data references while the program is running. This allows dynamic generation of messages and a correct implementation of the program. While the overhead of the run-time approach is higher than that of the compile-time approach, run-time analysis is applicable to any program. Performance data from an initial implementation show that both approaches are practical and produce code with acceptable efficiency.
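    To make the compile-time idea concrete, here is a minimal sketch (my own illustration, not the dissertation's actual formalism or code): for a statement like `a[i] = b[i-1]` with arrays block-distributed over P processors under an owner-computes rule, the compiler can enumerate exactly which elements cross processor boundaries and so must become messages.

    ```python
    def owner(i, n, p):
        """Block distribution: processor k owns indices [k*(n//p), (k+1)*(n//p))."""
        block = n // p
        return min(i // block, p - 1)   # last processor absorbs any remainder

    def messages(n, p, shift=1):
        """For the statement a[i] = b[i - shift], i in [shift, n), return the
        message set {(sender, receiver): [indices of b to send]} implied by
        owner-computes: owner(i) runs iteration i, so it needs b[i - shift]
        from owner(i - shift) whenever the two owners differ."""
        msgs = {}
        for i in range(shift, n):
            src, dst = owner(i - shift, n, p), owner(i, n, p)
            if src != dst:
                msgs.setdefault((src, dst), []).append(i - shift)
        return msgs
    ```

    For n = 8 and p = 4 (blocks of two elements), only the block-boundary elements of `b` generate traffic; everything else is a local reference. This is the sense in which compile-time analysis can "completely characterize the data sent": the message set is a closed-form function of the distribution and the subscript expression, computed before the program runs.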