5 research outputs found

    Modeling cost/performance of a parallel computer simulator

    This article examines the cost/performance of simulating a hypothetical target parallel computer using a commercial host parallel computer. We address the question of whether parallel simulation is simply faster than sequential simulation, or whether it is also more cost-effective. To answer this, we develop a performance model of the Wisconsin Wind Tunnel (WWT), a system that simulates cache-coherent shared-memory machines on a message-passing Thinking Machines CM-5. The performance model uses Kruskal and Weiss's fork-join model to account for the effect of event processing time variability on WWT's conservative fixed-window simulation algorithm. A generalization of Thiebaut and Stone's footprint model accurately predicts the effect of cache interference on the CM-5. The model is calibrated using parameters extracted from a fully parallel simulation (p = N), and validated by measuring the speedup as the number of processors (p) ranges from 1 to the number of target nodes (N). Together with simple cost models, the performance model indicates that for target system sizes of 32 nodes and larger, parallel simulation is more cost-effective than sequential simulation. The key intuition behind this result is that large simulations require large memories, which dominate the cost of a uniprocessor; parallel computers allow multiple processors to simultaneously access this large memory.
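
    The intuition in the last sentence can be illustrated with a small sketch. The abstract does not spell out its cost models, so everything below is an assumption: a linear cost model (one shared memory cost plus a per-processor cost), the common criterion that a parallel run is cost-effective when its speedup exceeds its "costup" (parallel cost divided by sequential cost), and purely hypothetical numbers.

```python
# Hypothetical sketch: the cost model, the names, and every number here are
# illustrative assumptions, not values taken from the article.

def costup(p, cpu_cost, memory_cost):
    """Ratio of the cost of a p-processor system to a uniprocessor system.

    Assumes both systems need the same total memory, so memory_cost is
    paid once in either case, while processor cost scales with p.
    """
    parallel_cost = p * cpu_cost + memory_cost
    sequential_cost = cpu_cost + memory_cost
    return parallel_cost / sequential_cost


def is_cost_effective(speedup, p, cpu_cost, memory_cost):
    """Parallel execution is deemed cost-effective when speedup > costup."""
    return speedup > costup(p, cpu_cost, memory_cost)


if __name__ == "__main__":
    # Memory dominates: it costs 15 units versus 1 unit per processor.
    print(costup(8, cpu_cost=1.0, memory_cost=15.0))
    # Even a modest speedup of 4 on 8 processors beats that costup.
    print(is_cost_effective(4.0, 8, cpu_cost=1.0, memory_cost=15.0))
```

    Under these assumed costs, the more memory dominates the total, the closer the costup stays to 1, so even sublinear speedups can make the parallel host the cheaper choice per unit of work.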

    Sirocco: cost-effective fine-grain distributed shared memory

    Software fine-grain distributed shared memory (FGDSM) provides a simplified shared-memory programming interface with minimal or no hardware support. Originally, software FGDSMs targeted uniprocessor-node parallel machines. This paper presents Sirocco, a family of software FGDSMs implemented on a network of low-cost SMPs. Sirocco takes full advantage of SMP nodes by implementing intra-node sharing directly in hardware and overlapping computation with protocol execution. To maintain correct shared-memory semantics, however, SMP nodes require mechanisms to guarantee atomic coherence operations. Multiple SMP processors may also contend for shared resources and reduce performance. SMP nodes also affect the cost trade-off. While SMPs typically carry higher price premiums, for a given system size SMP nodes substantially reduce networking hardware requirements compared to uniprocessor nodes. In this paper, we ask the question “Are SMPs cost-effective building blocks for software FGDSM?” We present experimental measurements on Sirocco implementations ranging from an all-software system to a system with minimal hardware support. Together with simple cost models, we show that low-cost SMP nodes: (i) result in competitive performance with uniprocessor nodes, (ii) substantially reduce hardware requirements and are more cost-effective than uniprocessor nodes, (iii) significantly benefit from hardware support for coherence operations, and (iv) are especially beneficial for FGDSMs with high-overhead coherence operations.

    Design of Thermal Management Control Policies for Multiprocessors Systems on Chip

    The contribution of this thesis is a thorough study of thermal-aware policy design for MPSoCs. The study includes the modeling of their thermal behavior as well as the improvement of existing thermal management and balancing policies and the definition of new ones. The work is structured around three disciplines: modeling, algorithms, and system design. This thesis extends the field of modeling by proposing new techniques to represent the thermal behavior of MPSoCs using a mathematical formalization. Heat transfer and the modeling of the physical properties of MPSoCs have been extensively studied. Special emphasis is given to the way the system cools down (i.e., micro-cooling, natural heat dissipation, etc.) and the way heat propagates inside the MPSoC. The second contribution of this work relates to policies that manage MPSoC working frequencies and micro-cooling pumps to satisfy user requirements as effectively as possible while consuming the fewest possible resources. Several families of thermal policy algorithms have been studied and analyzed in this work for both 2D and 3D MPSoCs, including liquid cooling technologies. The discipline of system design has also been extended during the development of this thesis. Thermal management policies have been implemented in real emulation platforms, and contributions in this area relate to the design and implementation of the proposed innovations in real MPSoC platforms.

    Modeling Cost/Performance of a Parallel Computer Simulator

    Babak Falsafi and David A. Wood. This paper examines the cost/performance of simulating a hypothetical target parallel computer using a commercial host parallel computer. We address the question of whether parallel simulation is simply faster than sequential simulation, or if it is also more cost-effective. To answer this, we develop a performance model of the Wisconsin Wind Tunnel (WWT), a system that simulates cache-coherent shared-memory machines on a message-passing Thinking Machines CM-5. The performance model uses Kruskal and Weiss's fork-join model to account for the..