2 research outputs found

    Characterizing a New Class of Threads in Scientific Applications for High End Supercomputers

    No full text
    Chip level multithreading is growing in use throughout the microprocessor world as evidenced in the Intel Pentium 4 and the upcoming innovations in the POWER architecture. These processors typically use a few coarse grain threads that can be difficult for the programmer or compiler to exploit; however, Processing in Memory (PIM) is a technology that has been explored through a long series of supercomputer projects as a facilitator for a different multithreaded execution models. In the multithreading model explored by PIMs, the threads can have radically different characteristics. Specifically, PIMs seek to exploit a large number of very fine grained threads to hide memory access latency and increase parallelism. PIM supports these small threads, or "threadlets", by providing a fast hardware synchronization mechanism, support for harware managment of creation and destruction of threads, and a "shared register" approach which extends the shared memory thread model. This paper discusses some analysis of Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy under contract DE-AC0494AL85000

    Reducing exception management overhead with software restart markers

    Get PDF
    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008.Includes bibliographical references (p. 181-196).Modern processors rely on exception handling mechanisms to detect errors and to implement various features such as virtual memory. However, these mechanisms are typically hardware-intensive because of the need to buffer partially-completed instructions to implement precise exceptions and enforce in-order instruction commit, often leading to issues with performance and energy efficiency. The situation is exacerbated in highly parallel machines with large quantities of programmer-visible state, such as VLIW or vector processors. As architects increasingly rely on parallel architectures to achieve higher performance, the problem of exception handling is becoming critical. In this thesis, I present software restart markers as the foundation of an exception handling mechanism for explicitly parallel architectures. With this model, the compiler is responsible for delimiting regions of idempotent code. If an exception occurs, the operating system will resume execution from the beginning of the region. One advantage of this approach is that instruction results can be committed to architectural state in any order within a region, eliminating the need to buffer those values. Enabling out-of-order commit can substantially reduce the exception management overhead found in precise exception implementations, and enable the use of new architectural features that might be prohibitively costly with conventional precise exception implementations. Additionally, software restart markers can be used to reduce context switch overhead in a multiprogrammed environment. This thesis demonstrates the applicability of software restart markers to vector, VLIW, and multithreaded architectures. It also contains an implementation of this exception handling approach that uses the Trimaran compiler infrastructure to target the Scale vectorthread architecture. I show that using software restart markers incurs very little performance overhead for vector-style execution on Scale.(cont.) Finally, I describe the Scale compiler flow developed as part of this work and discuss how it targets certain features facilitated by the use of software restart markersby Mark Jerome Hampton.Ph.D
    corecore