Search CORE

4 research outputs found

Load Value Approximation: Approaching the Ideal Memory Access Latency

Author: Joshua San Miguel
Natalie Enright Jerger
Publication venue
Publication date
Field of study

Approximate computing recognizes that many applications can tolerate inexactness. These applications, which range from multimedia processing to machine learning, operate on inherently noisy and imprecise data. As a result, we can tradeoff some loss in output value integrity for improved processor performance and energy-efficiency. In this paper, we introduce load value approximation. In modern processors, upon a load miss in the private cache, the data must be retrieved from main memory or from the higher-level caches. These data accesses are costly both in terms of latency and energy. We implement load value approximators, which are hardware structures that learn value patterns and generate approximations of the data. The processor can then use these approximate data values to continue executing without incurring the high cost of accessing memory. We show that load value approximators can achieve high coverage while maintaining very low error in the application’s output. By exploiting the approximate nature of applications, we can draw closer to the ideal memory access latency. 1

CiteSeerX

TriCheck: Memory Model Verification at the Trisection of Software, Hardware, and ISA

Author: Collier William W.
Gharachorloo Kourosh
International SPARC
J.
Jackson Daniel
Petri Gustavo
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2017
Field of study

Memory consistency models (MCMs) which govern inter-module interactions in a shared memory system, are a significant, yet often under-appreciated, aspect of system design. MCMs are defined at the various layers of the hardware-software stack, requiring thoroughly verified specifications, compilers, and implementations at the interfaces between layers. Current verification techniques evaluate segments of the system stack in isolation, such as proving compiler mappings from a high-level language (HLL) to an ISA or proving validity of a microarchitectural implementation of an ISA. This paper makes a case for full-stack MCM verification and provides a toolflow, TriCheck, capable of verifying that the HLL, compiler, ISA, and implementation collectively uphold MCM requirements. The work showcases TriCheck's ability to evaluate a proposed ISA MCM in order to ensure that each layer and each mapping is correct and complete. Specifically, we apply TriCheck to the open source RISC-V ISA, seeking to verify accurate, efficient, and legal compilations from C11. We uncover under-specifications and potential inefficiencies in the current RISC-V ISA documentation and identify possible solutions for each. As an example, we find that a RISC-V-compliant microarchitecture allows 144 outcomes forbidden by C11 to be observed out of 1,701 litmus tests examined. Overall, this paper demonstrates the necessity of full-stack verification for detecting MCM-related bugs in the hardware-software stack.Comment: Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating System

arXiv.org e-Print Archive

Princeton University Open Access Repository

Crossref

Correctly Implementing Value Prediction in Microprocessors that Support Multithreading or Multiprocessing

Author: Daniel J. Sorin
Harold W. Cain
Mark D. Hill
Mikko H. Lipasti
Milo M. K. Martin
Publication venue
Publication date: 01/01/2001
Field of study

This paper explores the interaction of value prediction with thread-level parallelism techniques, including multithreading and multiprocessing, where correctness is defined by a memory consistency model. Value prediction subtly interacts with the memory consistency model by allowing data dependent instructions to be reordered. We find that predicting a value and later verifying that the value eventually calculated is the same as the value predicted is not always sufficient. We present an example of a multithreaded pointer manipulation that can generate a surprising and erroneous result when value prediction is implemented without considering memory consistency correctness. We show that this problem can occur with real software, and we discuss how to apply existing techniques to eliminate the problem in both sequentially consistent systems and systems that obey relaxed memory consistency models

CiteSeerX

ScholarlyCommons@Penn