25 research outputs found

    Energy-efficient and high-performance lock speculation hardware for embedded multicore systems

    Full text link
    Embedded systems are becoming increasingly common in everyday life and like their general-purpose counterparts, they have shifted towards shared memory multicore architectures. However, they are much more resource constrained, and as they often run on batteries, energy efficiency becomes critically important. In such systems, achieving high concurrency is a key demand for delivering satisfactory performance at low energy cost. In order to achieve this high concurrency, consistency across the shared memory hierarchy must be accomplished in a cost-effective manner in terms of performance, energy, and implementation complexity. In this article, we propose Embedded-Spec, a hardware solution for supporting transparent lock speculation, without the requirement for special supporting instructions. Using this approach, we evaluate the energy consumption and performance of a suite of benchmarks, exploring a range of contention management and retry policies. We conclude that for resource-constrained platforms, lock speculation can provide real benefits in terms of improved concurrency and energy efficiency, as long as the underlying hardware support is carefully configured.This work is supported in part by NSF under Grants CCF-0903384, CCF-0903295, CNS-1319495, and CNS-1319095 as well the Semiconductor Research Corporation under grant number 1983.001. (CCF-0903384 - NSF; CCF-0903295 - NSF; CNS-1319495 - NSF; CNS-1319095 - NSF; 1983.001 - Semiconductor Research Corporation

    Evaluating critical bits in arithmetic operations due to timing violations

    Full text link
    Various error models are being used in simulation of voltage-scaled arithmetic units to examine application-level tolerance of timing violations. The selection of an error model needs further consideration, as differences in error models drastically affect the performance of the application. Specifically, floating point arithmetic units (FPUs) have architectural characteristics that characterize its behavior. We examine the architecture of FPUs and design a new error model, which we call Critical Bit. We run selected benchmark applications with Critical Bit and other widely used error injection models to demonstrate the differences

    Energy reduction in multiprocessor systems using transactional memory

    Get PDF

    Energy reduction in multiprocessor systems using transactional memory

    No full text
    The emphasis in microprocessor design has shifted from high performance, to a combination of high performance and low power. Until recently, this trend was mostly true for uniprocessors. In this work we focus on new energy consumption issues unique to multiprocessor systems: synchronization of accesses to shared memory. We investigate and compare different means of providing atomic access to shared memory, including locks and lock-free synchronization (i.e., transactional memory), with respect to energy as well as performance. We show that transactional memory has an advantage in terms of energy consumption over locks, but that this advantage largely depends on the system architecture, the contention level, and the policy of conflict resolution

    Energy-Aware Microprocessor Synchronization: Transactional Memory vs. Locks

    No full text
    One important way in which multiprocessors differ from uniprocessors is in the need to provide programmers the ability to synchronize concurrent access to memory. Transactional memory was proposed as a way of improving throughput especially when the rate of synchronization conflict is low. In this paper we explore power implications of transactional memory on standard and synthetic benchmarks. We propose a new “serial execution” mode that lowers energy consumption during high contention periods by reducing transaction throughput. We conclude that transactional models are a promising approach to low-power synchronization, and serial execution strengthens the energy advantage, but that further work is needed to fully understand how transactions compare to locks at high levels of contention
    corecore