9 research outputs found

    Exploiting Inter- and Intra-Memory Asymmetries for Data Mapping in Hybrid Tiered-Memories

    Modern computing systems are embracing hybrid memory comprising DRAM and non-volatile memory (NVM) to combine the best properties of both memory technologies, achieving low latency, high reliability, and high density. A prominent characteristic of DRAM-NVM hybrid memory is that NVM access latency is much higher than DRAM access latency. We call this inter-memory asymmetry. We observe that parasitic components on a long bitline are a major source of high latency in both DRAM and NVM, and a significant factor contributing to high-voltage operations in NVM, which impact their reliability. We propose an architectural change, where each long bitline in DRAM and NVM is split into two segments by an isolation transistor. One segment can be accessed with lower latency and operating voltage than the other. By introducing tiers, we enable non-uniform accesses within each memory type (which we call intra-memory asymmetry), leading to performance and reliability trade-offs in DRAM-NVM hybrid memory. We extend existing NVM-DRAM OS in three ways. First, we exploit both inter- and intra-memory asymmetries to allocate and migrate memory pages between the tiers in DRAM and NVM. Second, we improve the OS's page allocation decisions by predicting the access intensity of a newly referenced memory page in a program and placing it in a matching tier during its initial allocation. This minimizes page migrations during program execution, lowering the performance overhead. Third, we propose a solution to migrate pages between the tiers of the same memory without transferring data over the memory channel, minimizing channel occupancy and improving performance. Our overall approach, which we call MNEME, enables and exploits asymmetries in DRAM-NVM hybrid tiered memory and improves both performance and reliability for both single-core and multi-programmed workloads. Comment: 15 pages, 29 figures, accepted at ACM SIGPLAN International Symposium on Memory Management

    Dynamic Allocation/Reallocation of Dark Cores in Many-Core Systems for Improved System Performance

    A significant number of processing cores in any many-core system, now and likely in the future, have to be switched off or forced to be idle, becoming dark cores, in light of ever-increasing power density and chip temperature. Although these dark cores cannot make direct contributions to the chip's throughput, they can still be allocated to applications currently running in the system for the sole purpose of heat dissipation enabled by the temperature gradient between the active and dark cores. However, allocating dark cores to applications tends to add extra waiting time to applications yet to be launched, which in turn can have adverse implications for the overall system performance. Another big issue related to dark core allocation stems from the fact that application characteristics are prone to undergo rapid changes at runtime, making a fixed dark core allocation scheme less desirable. In this paper, a runtime dark core allocation and dynamic adjustment scheme is thus proposed. Built upon a dynamic programming network (DPN) framework, the proposed scheme attempts to optimize the performance of currently running applications and simultaneously reduce waiting times of incoming applications by taking into account both thermal issues and geometric shapes of regions formed by the active/dark cores. The experimental results show that the proposed approach achieves an average of 61% higher throughput than the two state-of-the-art thermal-aware runtime task mapping approaches, making it the runtime resource management of choice in many-core systems.

    Reliability and energy-aware mapping and scheduling of multimedia applications on multiprocessor systems

    Lifetime reliability is an emerging concern in multiprocessor systems, as escalating power density and hence temperature variation continue to accelerate wear-out, leading to a growing prominence of device defects. In this paper, we propose a system-level approach that involves performance-aware mapping of multimedia applications on a multiprocessor system to jointly minimize energy consumption and temperature-related wear-out. Fundamental to this approach is a simplified temperature model that incorporates not only the transient and the steady-state behavior (temporal effect), but also the temperature dependency on the surrounding cores (spatial effect). This model is validated against the temperature obtained using the HotSpot tool with transient and steady-state simulations, and is shown to be accurate to within 5.5 °C, leading to an MTTF estimation accuracy of an average 21% with respect to the state-of-the-art approaches. The proposed temperature model is integrated in a gradient-based fast heuristic that controls the voltage and frequency of the cores to limit the average and peak temperature, leading to a longer lifetime while simultaneously minimizing the energy consumption. Lifetime computation considers task remapping, which is a common feature available in modern multiprocessor systems. A linear programming approach is then proposed to distribute the cores of a multiprocessor system among concurrent applications to maximize the lifetime. Experiments conducted with a set of synthetic and real-life applications represented as synchronous data flow graphs demonstrate that the proposed approach reduces energy consumption by an average of 24% with a 47% increase in lifetime. For concurrent applications, the proposed lifetime-aware core distribution results in an average 10% improvement in lifetime as compared to performance-based core distribution.

    System-level design of energy-efficient sensor-based human activity recognition systems: a model-based approach

    This thesis contributes an evaluation of state-of-the-art dataflow models of computation regarding their suitability for model-based design and analysis of human activity recognition systems, in terms of expressiveness and analyzability, as well as model accuracy. Different aspects of state-of-the-art human activity recognition systems have been modeled and analyzed. Based on existing methods, novel analysis approaches have been developed to acquire extra-functional properties such as processor utilization, data communication rates, and, finally, energy consumption of the system.