5 research outputs found

    Towards Energy-Efficient and Reliable Computing: From Highly-Scaled CMOS Devices to Resistive Memories

    Get PDF
    The continuous increase in transistor density based on Moore\u27s Law has led us to highly scaled Complementary Metal-Oxide Semiconductor (CMOS) technologies. These transistor-based process technologies offer improved density as well as a reduction in nominal supply voltage. An analysis regarding different aspects of 45nm and 15nm technologies, such as power consumption and cell area to compare these two technologies is proposed on an IEEE 754 Single Precision Floating-Point Unit implementation. Based on the results, using the 15nm technology offers 4-times less energy and 3-fold smaller footprint. New challenges also arise, such as relative proportion of leakage power in standby mode that can be addressed by post-CMOS technologies. Spin-Transfer Torque Random Access Memory (STT-MRAM) has been explored as a post-CMOS technology for embedded and data storage applications seeking non-volatility, near-zero standby energy, and high density. Towards attaining these objectives for practical implementations, various techniques to mitigate the specific reliability challenges associated with STT-MRAM elements are surveyed, classified, and assessed herein. Cost and suitability metrics assessed include the area of nanomagmetic and CMOS components per bit, access time and complexity, Sense Margin (SM), and energy or power consumption costs versus resiliency benefits. In an attempt to further improve the Process Variation (PV) immunity of the Sense Amplifiers (SAs), a new SA has been introduced called Adaptive Sense Amplifier (ASA). ASA can benefit from low Bit Error Rate (BER) and low Energy Delay Product (EDP) by combining the properties of two of the commonly used SAs, Pre-Charge Sense Amplifier (PCSA) and Separated Pre-Charge Sense Amplifier (SPCSA). ASA can operate in either PCSA or SPCSA mode based on the requirements of the circuit such as energy efficiency or reliability. Then, ASA is utilized to propose a novel approach to actually leverage the PV in Non-Volatile Memory (NVM) arrays using Self-Organized Sub-bank (SOS) design. SOS engages the preferred SA alternative based on the intrinsic as-built behavior of the resistive sensing timing margin to reduce the latency and power consumption while maintaining acceptable access time

    Tiered algorithm for distributed process quiescence and termination detection

    No full text
    The Tiered Algorithm is presented for time- efficient and message- efficient detection of process termination. It employs a global invariant of equality between process production and consumption at each level of process nesting to detect termination, regardless of execution interleaving order and network transit time. Correctness is validated for arbitrary process launching hierarchies, including launch- in- transit hazards, where processes are created dynamically based on runtime conditions for remote execution. The performance of the Tiered Algorithm is compared to three existing schemes with comparable capabilities, namely, the Chandrasekaran and Venkatesan ( CV), Lai, Tseng, and Dong ( LTD), and Credit termination detection algorithms. For synchronization of T tasks terminating in E epochs of idle processing, the Tiered Algorithm is shown to incur O(E) message count complexity and O(T lg T) message bit complexity while incurring detection latency corresponding to only integer addition and comparison. The synchronization performance in terms of message overhead, detection operations, and storage requirements are evaluated and compared across numerous task creation and termination hierarchies

    Tiered Algorithm for Distributed Process Quiescence and Termination Detection

    No full text
    Abstract⎯The Tiered Algorithm is presented for time-efficient and message-efficient detection of process termination. It employs a global invariant of equality between process production and consumption at each level of process nesting to detect termination regardless of execution interleaving order and network transit time. Correctness is validated for arbitrary process launching hierarchies, including launch-in-transit hazards where processes are created dynamically based on run-time conditions for remote execution. The performance of the Tiered Algorithm is compared to three existing schemes with comparable capabilities, namely the CV, LTD, and Credit termination detection algorithms. For synchronization of T tasks terminating in E epochs of idle processing, the Tiered Algorithm is shown to incur O(E) message count complexity and O(T lg T) message bit complexity while incurring detection latency corresponding to only integer addition and comparison. The synchronization performance in terms of messaging overhead, detection operations, and storage requirements are evaluated and compared across numerous task creation and termination hierarchies

    Tiered Algorithm For Distributed Process Quiescence And Termination Detection

    No full text
    The Tiered Algorithm is presented for time-efficient and message-efficient detection of process termination. It employs a global invariant of equality between process production and consumption at each level of process nesting to detect termination regardless of execution interleaving order and network transit time. Correctness is validated for arbitrary process launching hierarchies, including launch-in-transit hazards where processes are created dynamically based on run-time conditions for remote execution. The performance of the Tiered Algorithm is compared to three existing schemes with comparable capabilities, namely the CV, LTD, and Credit termination detection algorithms. For synchronization of T tasks terminating in E epochs of idle processing, the Tiered Algorithm is shown to incur O(E) message count complexity and O(T lg T) message bit complexity while incurring detection latency corresponding to only integer addition and comparison. The synchronization performance in terms of messaging overhead, detection operations, and storage requirements are evaluated and compared across numerous task creation and termination hierarchies. © 2007 IEEE
    corecore