6 research outputs found
Harnessing resilience: biased voltage overscaling for probabilistic signal processing
A central component of modern computing is the idea that computation requires
determinism. Contrary to this belief, the primary contribution of this work shows that
useful computation can be accomplished in an error-prone fashion. Focusing on low-power
computing and the increasing push toward energy conservation, the work seeks to sacrifice
accuracy in exchange for energy savings.
Probabilistic computing forms the basis for this error-prone computation by diverging from the requirement of determinism and allowing for randomness within computing.
Implemented as probabilistic CMOS (PCMOS), the approach realizes enormous energy sav-
ings in applications that require probability at an algorithmic level. Extending probabilistic
computing to applications that are inherently deterministic, the biased voltage overscaling
(BIVOS) technique presented here constrains the randomness introduced through PCMOS.
Doing so, BIVOS is able to limit the magnitude of any resulting deviations and realizes
energy savings with minimal impact to application quality.
Implemented for a ripple-carry adder, array multiplier, and finite-impulse-response (FIR)
filter; a BIVOS solution substantially reduces energy consumption and does so with im-
proved error rates compared to an energy equivalent reduced-precision solution. When
applied to H.264 video decoding, a BIVOS solution is able to achieve a 33.9% reduction in
energy consumption while maintaining a peak-signal-to-noise ratio of 35.0dB (compared to
14.3dB for a comparable reduced-precision solution).
While the work presented here focuses on a specific technology, the technique realized
through BIVOS has far broader implications. It is the departure from the conventional
mindset that useful computation requires determinism that represents the primary innovation of this work. With applicability to emerging and yet to be discovered technologies,
BIVOS has the potential to contribute to computing in a variety of fashions.PhDCommittee Chair: Anderson, David; Committee Member: Conte, Thomas; Committee Member: Ferri, Bonnie; Committee Member: Hasler, Paul; Committee Member: Mooney, Vincen
Programmable stochastic processors
As traditional approaches for reducing power in microprocessors are being exhausted, extreme power challenges call for unconventional approaches to power reduction. Recent research has shown substantial promise for application-specific stochastic computing, i.e., computing that exploits application error tolerance to enable careful relaxation of correctness guarantees provided by hardware in order to reduce power. This dissertation explores the feasibility, challenges, and potential benefits of stochastic computing in the context of programmable general purpose processors. Specifically, the dissertation describes design-level techniques that minimize the power of a processor for a non-zero error rate or allow a processor to fail gracefully when operated over a range of non-zero error rates. It presents microarchitectural design principles that allow a processor to trade off reliability and energy more efficiently to minimize energy when exploiting error resilience. It demonstrates the benefit of using compiler optimizations that optimize a binary to enable more energy savings when operating at a non-zero error rate. It also demonstrates significant benefits for a programmable stochastic processor prototype that improves energy efficiency by carefully relaxing correctness and exposing errors in applications running on a commodity processor. This dissertation on programmable stochastic processors conclusively shows that the architecture and design of processors and applications should be approached differently in scenarios where errors are allowed to be exposed from the hardware to higher levels of the compute stack. Significant energy benefits are demonstrated for design-, architecture-, compiler-, and application-level optimizations for general purpose programmable stochastic processors
Recommended from our members
Enabling high-performance, mixed-signal approximate computing
textFor decades, the semiconductor industry enjoyed exponential improvements in microprocessor power and performance with the device scaling of successive technology generations. Scaling limitations at sub-micron technologies, however, have ceased to provide these historical performance improvements within a limited power budget. While device scaling provides a larger number of transistors per chip, for the same chip area, a growing percentage of the chip will have to be powered off at any given time due to power constraints. As such, the architecture community has focused on energy-efficient designs and is looking to specialized hardware to provide gains in performance. A focus on energy efficiency, along with increasingly less reliable transistors due to device scaling, has led to research in the area of approximate computing, where accuracy is traded for energy efficiency when precise computation is not required. There is a growing body of approximation-tolerant applications that, for example, compute on noisy or incomplete data, such as real-world sensor inputs, or make approximations to decrease the computation load in the analysis of cumbersome data sets. These approximation-tolerant applications span application domains, such as machine learning, image processing, robotics, and financial analysis, among others. Since the advent of the modern processor, computing models have largely presumed the attribute of accuracy. A willingness to relax accuracy requirements, however, with goal of gaining energy efficiency, warrants the re-investigation of the potential of analog computing. Analog hardware offers the opportunity for fast and low-power computation; however, it presents challenges in the form of accuracy. Where analog compute blocks have been applied to solve fixed-function problems, general-purpose computing has relied on digital hardware implementations that provide generality and programmability. The work presented in this thesis aims to answer the following questions: Can analog circuits be successfully integrated into general-purpose computing to provide performance and energy savings? And, what is required to address the historical analog challenges of inaccuracy, programmability, and a lack of generality to enable such an approach? This thesis work investigates a neural approach as a means to address the historical analog challenges of inaccuracy, programmability, and generality and to enable the use of analog circuits in general-purpose, high-performance computing. The first piece of this thesis work investigates the use of analog circuits at the microarchitecture level in the form of an analog neural branch predictor. The task of branch prediction can tolerate imprecision, as roll-back mechanisms correct for branch mispredictions, and application-level accuracy remains unaffected. We show that analog circuits enable the implementation of a highly-accurate, neural-prediction algorithm that is infeasible to implement in the digital domain. The second piece of this thesis work presents a neural accelerator that targets approximation-tolerant code. Analog neural acceleration provides application speedup of 3.3x and energy savings of 12.1x with a quality loss less than 10% for all except one approximation-tolerant benchmark. These results show that, using a neural approach, analog circuits can be applied to provide performance and energy efficiency in high-performance, general-purpose computing.Computer Science
Dependable Embedded Systems
This Open Access book introduces readers to many new techniques for enhancing and optimizing reliability in embedded systems, which have emerged particularly within the last five years. This book introduces the most prominent reliability concerns from today’s points of view and roughly recapitulates the progress in the community so far. Unlike other books that focus on a single abstraction level such circuit level or system level alone, the focus of this book is to deal with the different reliability challenges across different levels starting from the physical level all the way to the system level (cross-layer approaches). The book aims at demonstrating how new hardware/software co-design solution can be proposed to ef-fectively mitigate reliability degradation such as transistor aging, processor variation, temperature effects, soft errors, etc. Provides readers with latest insights into novel, cross-layer methods and models with respect to dependability of embedded systems; Describes cross-layer approaches that can leverage reliability through techniques that are pro-actively designed with respect to techniques at other layers; Explains run-time adaptation and concepts/means of self-organization, in order to achieve error resiliency in complex, future many core systems
Energy-Centric Scheduling for Real-Time Systems
Energy consumption is today an important design issue for all kinds of digital systems, and essential for the battery operated ones. An important fraction of this energy is dissipated on the processors running the application software. To reduce this energy consumption, one may, for instance, lower the processor clock frequency and supply voltage. This, however, might lead to a performance degradation of the whole system. In real-time systems, the crucial issue is timing, which is directly dependent on the system speed. Real-time scheduling and energy efficiency are therefore tightly connected issues, being addressed together in this work. Several scheduling approaches for low energy are described in the thesis, most targeting variable speed processor architectures. At task level, a novel speed scheduling algorithm for tasks with probabilistic execution pattern is introduced and compared to an already existing compile-time approach. For task graphs, a list-scheduling based algorithm with an energy-sensitive priority is proposed. For task sets, off-line methods for computing the task maximum required speeds are described, both for rate-monotonic and earliest deadline first scheduling. Also, a run-time speed optimization policy based on slack re-distribution is proposed for rate-monotonic scheduling. Next, an energy-efficient extension of the earliest deadline first priority assignment policy is proposed, aimed at tasks with probabilistic execution time. Finally, scheduling is examined in conjunction with assignment of tasks to processors, as parts of various low energy design flows. For some of the algorithms given in the thesis, energy measurements were carried out on a real hardware platform containing a variable speed processor. The results confirm the validity of the initial assumptions and models used throughout the thesis. These experiments also show the efficiency of the newly introduced scheduling methods