
    Influence of Input/output Operations on Processor Performance

    Nowadays, computers are frequently equipped with peripherals that transfer large amounts of data to and from system memory using direct memory access (DMA) techniques (e.g., digital cameras, high-speed networks). These peripherals prevent the processor from accessing system memory for significant periods of time, i.e., while they are communicating with system memory to send or receive data blocks. In this paper we study the negative effects that I/O operations from computer peripherals have on processor performance. With the help of SMPL, a set of routines for building discrete-event simulators, we have developed configurable software that simulates a computer processor and main memory, as well as the I/O scenarios in which the peripherals operate. This software has been used to analyze the performance of four different processors in four I/O scenarios: video capture, video capture and playback, high-speed networking, and serial transmission.
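
    The sketch below is not the authors' SMPL-based simulator; it is a minimal Python illustration of the kind of discrete-event model the abstract describes, with invented timing parameters. A CPU and a DMA peripheral contend for one memory bus, and bus time consumed by DMA block transfers shows up as CPU stall time.

    ```python
    import heapq

    # Invented timing parameters for this sketch (not from the paper):
    CPU_PERIOD = 10         # ns between CPU memory requests
    CPU_ACCESS = 5          # ns the bus is held per CPU access
    DMA_PERIOD = 200        # ns between DMA block transfers
    DMA_BLOCK  = 80         # ns the bus is held per DMA block transfer
    SIM_TIME   = 1_000_000  # total simulated ns

    events = [(0, "cpu"), (0, "dma")]   # (time, actor) min-heap
    bus_free_at = 0                     # time the shared memory bus frees up
    cpu_stall = 0                       # total ns CPU requests waited for the bus

    while events:
        t, actor = heapq.heappop(events)
        if t >= SIM_TIME:               # earliest pending event is past the horizon
            break
        start = max(t, bus_free_at)     # wait until the bus is free
        if actor == "cpu":
            cpu_stall += start - t      # waiting caused, directly or not, by DMA
            bus_free_at = start + CPU_ACCESS
            heapq.heappush(events, (t + CPU_PERIOD, "cpu"))
        else:
            bus_free_at = start + DMA_BLOCK
            heapq.heappush(events, (t + DMA_PERIOD, "dma"))

    print(f"CPU requests spent {cpu_stall / SIM_TIME:.1%} of simulated time stalled")
    ```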

    Architectural Exploration of Data Recomputation for Improving Energy Efficiency

    University of Minnesota Ph.D. dissertation, July 2017. Major: Electrical/Computer Engineering. Advisor: Ulya Karpuzcu. 1 computer file (PDF); viii, 99 pages.

    There are two fundamental challenges for modern computer system design. The first is accommodating the increasing demand for performance within a tight power budget. The second is ensuring correct progress despite the increasing possibility of faults in the system. To address the first challenge, it is essential to track where the power goes. The energy consumption of data orchestration (i.e., storage, movement, communication) dominates the energy consumption of actual data production, i.e., computation. Oftentimes, recomputing data is more energy efficient than storing and retrieving pre-computed data, because it avoids the prevalent power and performance overheads of data storage, retrieval, and communication. At the same time, recomputation can reduce the demand for communication bandwidth and shrink the memory footprint. In the first half of the dissertation, the potential of data recomputation for improving energy efficiency is quantified, and a practical recomputation framework is introduced to trade computation for communication. To address the second challenge, scalable checkpointing and recovery mechanisms are needed. The traditional way to recover from a fault is to periodically checkpoint the state of the machine; this makes rollback and restart of execution from a safe state possible upon detection of a fault. The energy overhead of checkpointing, however, as incurred by storage and communication of the machine state, grows with the frequency of checkpointing. Amortizing this overhead becomes especially challenging considering the growth of expected error rates as an artifact of contemporary technology scaling. Recomputation of data (which would otherwise be read from a checkpoint) can reduce both the frequency of checkpointing and the size of the checkpoints, thereby mitigating checkpointing overhead. In the second half, a quantitative characterization of recomputation-enabled checkpointing (based on the recomputation framework) is provided.
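
    The core trade-off can be written as a simple inequality; this is a paraphrase, not a formula quoted from the dissertation, and the checkpoint-interval result is Young's classical approximation rather than anything specific to this work.

    ```latex
    % Recomputation pays off for a value x whenever regenerating it from
    % still-resident inputs costs less energy than the storage round trip:
    E_{\text{recompute}}(x) \;<\; E_{\text{store}}(x) + E_{\text{retrieve}}(x)

    % For the checkpointing half: with checkpoint cost C and mean time
    % between failures M, Young's approximation gives the
    % overhead-minimizing checkpoint interval
    \tau_{\text{opt}} \approx \sqrt{2\,C\,M}
    % Recomputation reduces both checkpoint size (smaller C) and how often
    % checkpoints must be taken, cutting the total energy spent on them.
    ```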

    DataScalar Architectures and the SPSD Execution Model

    The increasing power of commodity microprocessors is forcing system designers to provide more complex and expensive memory hierarchies. A potentially cheaper and better-performing alternative in the long run is to integrate the processor and main memory on the same die or module. In this paper, we propose an architecture (DataScalar) and an execution model (SPSD) that permit efficient execution of uniprocessor programs across multiple integrated processor/memory modules. We then describe four features of this proposal that permit improved performance: ESP gains, memory prefetching, result communication, and hybrid parallel execution. Finally, we present examples and measurements which give evidence that each feature will improve performance on future systems that have very expensive off-chip communication.

    1 Introduction. Modern microprocessors exhibit very high levels of performance, which nevertheless continue to increase exponentially [1]. These CPUs require a memory system th…
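
    A toy sketch of the "ESP gains" idea, using an invented ownership map and address trace rather than anything measured in the paper: under SPSD, every module runs the same program, so the module that owns an operand broadcasts it unsolicited, replacing the request-plus-reply round trip of a conventional remote read.

    ```python
    # Toy message-traffic comparison (invented ownership map and trace,
    # not measurements from the paper): conventional remote reads need a
    # request plus a reply; DataScalar-style SPSD needs one broadcast.

    N_MODULES = 4

    def owner(addr):
        """Block-interleaved ownership: which module holds this address."""
        return addr % N_MODULES

    def conventional_msgs(trace, home=0):
        """One module runs the program; each remote operand costs a round trip."""
        return sum(2 for addr in trace if owner(addr) != home)

    def spsd_msgs(trace):
        """Every module runs the same program; each operand's owner broadcasts
        it once, and no module ever sends a request (the 'ESP' effect)."""
        return len(trace)

    trace = list(range(100))            # toy operand-address trace
    print(conventional_msgs(trace))     # 150: 75 remote reads x 2 messages
    print(spsd_msgs(trace))             # 100: one broadcast per operand
    ```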

    Revisiting the Massive Memory Machine for Processing in Memory

    Many modern problems in computing are, by their nature, memory-bound. This essentially means that adding more and more computing power, i.e., more and faster CPUs, would not make them run faster as one might expect, since the bottleneck is the movement of data from memory to the CPU, which can also be costly in terms of energy consumption. This thesis investigates a collection of ideas that aim to tackle such problems efficiently: the Massive Memory Machine, its successor the DataScalar architecture, Processing in Memory architectures, and memory interleaving. The thesis describes an attempt to implement the Massive Memory Machine on the ChampSim simulator. It also explains why the DAMOV simulator and the UPMEM PIM architecture cannot support the Massive Memory Machine, due to the lack of communication between CPUs. The Massive Memory Machine improved the performance of the programs that were tested; although the performance did not scale linearly with the number of CPUs added, it improved sufficiently.
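
    Memory interleaving, one of the ideas the thesis surveys, is simply a policy for mapping addresses to modules. Below is a minimal sketch of two common mappings, with invented parameters; the thesis's actual ChampSim configuration may differ.

    ```python
    # Two ways to spread one address space across M processor/memory
    # modules (parameters invented; the thesis's setup may differ).

    M = 4                  # number of modules
    BLOCK = 64             # bytes per block

    def fine_interleave(addr):
        """Consecutive blocks rotate across modules."""
        block, offset = divmod(addr, BLOCK)
        return block % M, (block // M) * BLOCK + offset

    def coarse_partition(addr, bytes_per_module=2**30):
        """Each module owns one contiguous slice of the address space."""
        return divmod(addr, bytes_per_module)

    # Neighboring blocks land on different modules under fine interleaving:
    print([fine_interleave(a) for a in (0, 64, 128, 192, 256)])
    # -> [(0, 0), (1, 0), (2, 0), (3, 0), (0, 64)]
    ```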

    Skirting Amdahl's Law: Using SPSD Execution with Optical Interconnects

    Optical interconnects provide new parallel processing opportunities through inexpensive broadcasts and high-bandwidth, point-to-point connections. However, the problems of flow control and buffering inhibit current parallel architectures from effectively exploiting the advantages of optical interconnects. We propose using an execution model called Single Program, Single Data stream (SPSD) to exploit inexpensive optical broadcasts and reduce the serial overheads of parallel programs. We describe one possible implementation of such a system (DataScalar), and discuss how future systems can be designed to better exploit optical interconnects.

    1 Introduction. From the architect's perspective, optical interconnects provide a new and interesting set of opportunities for designing multiprocessor systems. Intrinsically high bandwidth, potentially low latencies, and the "free space" nature of optical interconnects all provide new opportunities for both evolutionary and revolutionary changes in mul…
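
    For context, the Amdahl's law being "skirted" is the standard result below; the closing comment is a one-sentence restatement of the abstract's argument, not a quotation.

    ```latex
    % Amdahl's law: if a fraction p of the work parallelizes across
    % n processors, the overall speedup is
    \text{speedup}(n) \;=\; \frac{1}{(1 - p) + p/n}
    % which approaches 1/(1 - p) as n grows: the serial fraction bounds
    % the gain. SPSD replicates the serial portions of a program on every
    % node and uses cheap optical broadcasts for the data they produce,
    % shrinking the overheads that inflate (1 - p).
    ```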