20 research outputs found

    Way Stealing: A Unified Data Cache and Architecturally Visible Storage for Instruction Set Extensions

    Way Stealing is a simple architectural modification to a cache-based processor that increases the data bandwidth to and from application-specific instruction set extensions (ISEs), which in turn increase performance and reduce energy consumption. Way Stealing offers higher bandwidth than interfacing the ISEs to the processor's register file, and eliminates the need both to allocate separate memories, called architecturally visible storage (AVS), that are dedicated to the ISEs and to ensure coherence between the AVS memories and the processor's data cache. Our results show that Way Stealing is competitive in terms of performance and energy consumption with other techniques that use AVS memories in conjunction with a data cache.

    Way Stealing: Cache-assisted Automatic Instruction Set Extensions

    This paper introduces Way Stealing, a simple architectural modification to a cache-based processor to increase data bandwidth to and from application-specific Instruction Set Extensions (ISEs). Way Stealing provides more bandwidth to the ISE logic than the register file alone and does not require expensive coherence protocols, as it does not add memory elements to the processor. When enhanced with Way Stealing, ISE identification flows detect more opportunities for acceleration than prior methods; consequently, Way Stealing can accelerate applications by up to 3.7x, whilst reducing the memory sub-system energy consumption by up to 67%, despite data-cache-related restrictions.

    Optically-Clocked Instruction Set Extensions for High Efficiency Embedded Processors

    We propose a technique to localize computation in Instruction Set Extensions (ISEs) that are clocked at very high speed with respect to the processor. In order to save power, data to and from Custom Instruction Units (CIUs) is synchronized via an optical signal that is detected through a Single-Photon Avalanche Diode (SPAD) capable of timing uncertainties as low as 50 ps. The CIUs comprise a free-standing local oscillator serving a computing area of a few tens of square micrometers, thus resulting in greatly reduced power dissipation, since the distribution of a high-frequency clock over long distances is avoided. This approach is based on the globally asynchronous locally synchronous concept, whereby the granularity of the local domains is reduced to a minimum, thus enabling extremely high local clock frequencies and low power, while minimizing substrate noise injection and intra-chip interference. Thanks to this approach, we can free ourselves from expensive synchronization techniques such as FIFOs, delays, or flip-flop-based synchronizers by creating fixed synchronization points in time where data can be exchanged. The paradigm is demonstrated on a chip designed and fabricated in a standard 90 nm CMOS technology. A full characterization demonstrates the suitability of the approach.

    Single-Photon Synchronous Detection

    A novel imaging technique is proposed for fully digital detection of phase and intensity of light. A fully integrated camera implementing the new technique was fabricated in a 0.35 μm CMOS technology. When coupled to a modulated light source, the camera can be used to accurately and rapidly reconstruct a 3D scene by evaluating the time-of-flight of the light reflected by a target. In passive mode, it allows building differential phase maps of reflection patterns for image enhancement purposes. Tests show the suitability of the technique and confirm phase accuracy predictions.