thesis

Modular Architectures And Optimization Techniques For Power And Reliability In Future Many Core Microprocessors

Abstract

Power and reliability issues are expected to increase in future multicore systems with a higher degree of component integration. As the feature sizes of transistors continue to shrink, more resources can be incorporated in microprocessors to address a broader spectrum of different application requirements. However, power constraints will limit the amount of resources that can be powered on at any given time. Recent studies have shown that future multicore systems will be able to power on less than 80% of their transistors in the near future, and less than 50% in the long term. The most difficult challenge is deciding which transistors should be powered on at any given time to deliver high performance under strict power constraints. At the same time, device reliability issues - the proliferation of devices that will either be defective at manufacturing time or will fail in the field with usage - are projected to be exacerbated by the continued scaling of device sizes. We present a modular, dynamically reconfigurable architecture as a promising unified solution to the problems of dark silicon (the inability to power all available computing resources) and reliability. Our modular architecture implements deconfigurable lanes within the decoupled sections of a superscalar pipeline that can be easily powered on or off to isolate faults or create an energy-efficient hardware configuration tailored to the needs of the running software. At the system level, we propose a novel framework that uses surrogate response surfaces and heuristic global optimization algorithms to characterize the behavior of applications at runtime and dynamically redistribute the available chip-wide power to obtain hardware configurations customized for the software diversity and system goals. Our reconfigurable architecture is able to provide high performance under a strict power budget, maintain a certain performance level at a reduced power cost, and in the case of hard faults, restore the system's performance to pre-fault levels

    Similar works

    Full text

    thumbnail-image

    Available Versions