
    CellSim: a validated modular heterogeneous multiprocessor simulator

    As the number of transistors on a chip continues to increase, power consumption has become the most important constraint in processor design. Therefore, to increase performance, computer architects have turned to multiprocessors. Moreover, recent studies have shown that heterogeneous chip multiprocessors have greater potential than homogeneous ones. We have built a modular simulator for heterogeneous multiprocessors that can be configured to model IBM's Cell Processor. The simulator has been validated against the real machine so that it can be used as a research tool. Peer Reviewed. Postprint (published version).

    Validation of a fault-tolerant multiprocessor: Baseline experiments and workload implementation

    In the future, aircraft must employ highly reliable multiprocessors in order to achieve flight safety. Such computers must be experimentally validated before they are deployed. This project outlines a methodology for validating reliable multiprocessors. The methodology begins with baseline experiments, each of which tests a single phenomenon. As experiments progress, tools for performance testing are developed. The methodology is used, in part, on the Fault Tolerant Multiprocessor (FTMP) at NASA-Langley's AIRLAB facility. Experiments are designed to evaluate the fault-free performance of the system. Presented are the results of interrupt baseline experiments performed on FTMP. Interrupt-causing exception conditions were tested; several were found to have unimplemented interrupt handling software, and one had an unimplemented interrupt vector. A synthetic workload model for real-time multiprocessors is then developed as an application-level performance analysis tool. Details of the workload implementation and calibration are presented. Both the experimental methodology and the synthetic workload model are general enough to be applicable to reliable multiprocessors besides FTMP.

    Shared versus distributed memory multiprocessors

    The question of whether multiprocessors should have shared or distributed memory has attracted a great deal of attention. Some researchers argue strongly for building distributed memory machines, while others argue just as strongly for programming shared memory multiprocessors. A great deal of research is underway on both types of parallel systems. Special emphasis is placed on systems with a very large number of processors for computation-intensive tasks, and research and implementation trends are considered. It appears that the two types of systems will likely converge to a common form for large-scale multiprocessors.

    Adaptive runtime-assisted block prefetching on chip-multiprocessors

    Memory stalls are a significant source of performance degradation in modern processors. Data prefetching is a widely adopted and well-studied technique used to alleviate this problem. Prefetching can be performed by the hardware, or be initiated and controlled by software. Among software-controlled prefetching schemes we find a wide variety, including runtime-directed prefetching and, more specifically, runtime-directed block prefetching. This paper proposes a hybrid prefetching mechanism that integrates a software-driven block prefetcher with existing hardware prefetching techniques. Our runtime-assisted software prefetcher brings large blocks of data on-chip with the support of a low-cost hardware engine, and works in synergy with existing hardware prefetchers that manage locality at a finer granularity. The runtime system that drives the prefetch engine dynamically selects which cache to prefetch to. Our evaluation on a set of scientific benchmarks obtains a maximum speedup of 32 % and an average speedup of 10 % compared to a baseline with hardware prefetching only. As a result, we also achieve a reduction in energy-to-solution of up to 18 %, and 3 % on average. Peer Reviewed. Postprint (author's final draft).

    Investigation of LSTM Based Prediction for Dynamic Energy Management in Chip Multiprocessors

    In this paper, we investigate the effectiveness of using long short-term memory (LSTM) instead of Kalman filtering for prediction when constructing dynamic energy management (DEM) algorithms in chip multi-processors (CMPs). Either of the two prediction methods is employed to estimate the workload in the next control period for each of the processor cores. These estimates are then used to select voltage-frequency (VF) pairs for each core of the CMP during the next control period as part of a dynamic voltage and frequency scaling (DVFS) technique. The objective of the DVFS technique is to reduce energy consumption under performance constraints that are set by the user. We conduct our investigation using a custom Sniper system simulation framework. Simulation results for 16- and 64-core network-on-chip based CMP architectures, using several benchmarks, demonstrate that the LSTM is slightly better than Kalman filtering.
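
    The predict-then-select loop described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' algorithm: the exponential-smoothing predictor is only a stand-in for the LSTM or Kalman filter, the VF table values are hypothetical, and the performance constraint is assumed to mean "the predicted cycles must fit within the control period".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VFPair:
    voltage: float    # volts (hypothetical values)
    frequency: float  # GHz (hypothetical values)


# Hypothetical VF table, sorted by ascending frequency (and so by power).
VF_TABLE = [VFPair(0.8, 1.0), VFPair(0.9, 1.5), VFPair(1.0, 2.0), VFPair(1.1, 2.5)]


def predict_next(history, alpha=0.5):
    """Estimate the next period's workload (in cycles) for one core.

    Exponential smoothing is used here purely as a placeholder for the
    LSTM or Kalman predictor discussed in the abstract.
    """
    estimate = history[0]
    for observed in history[1:]:
        estimate = alpha * observed + (1 - alpha) * estimate
    return estimate


def select_vf(predicted_cycles, period_s, table=VF_TABLE):
    """Pick the lowest-frequency pair that can run the predicted cycles
    within one control period; fall back to the fastest pair otherwise."""
    for pair in table:
        if pair.frequency * 1e9 * period_s >= predicted_cycles:
            return pair
    return table[-1]


def plan_next_period(histories, period_s):
    """Return one VF pair per core for the next control period."""
    return [select_vf(predict_next(h), period_s) for h in histories]
```

    Choosing the slowest pair that still meets the constraint is one simple way to trade energy for performance; a real DEM algorithm would also have to account for prediction error and VF switching overhead.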