5 research outputs found

    AN INFORMATIVE PIPELINING WITH SCHEDULING REGULATOR TO SUPPORT RECOVERY

    Get PDF
    Within this work, we apply Razor to hardware accelerators that find growing application in System-on-Nick designs rich in-performance needs that must definitely be delivered under stringent power budgets. We exploit these traits usual for DSP and image-processing accelerators to apply Razor recovery in manner that's amenable to RTL validation and verification. We describe the implementation and plastic measurement is a result of a Razor-based hardware loop-accelerator (RZLA), applying the Sobel edge-recognition formula. Unlike microprocessors, the RZLA pipeline is data path-dominated with statically-scheduled control which has queue-based storage structures that are simply extended to aid check-pointing and recovery. The RFF is deployed along with an amount-sensitive latch-insertion based formula to deal with the minimum-delay constraint contained in all Razor systems. This formula enables using the time period for timing speculation resulting in robust error recognition and correction across a large dynamic current- and frequency-scaling range

    An FPGA-based infrastructure for fine-grained DVFS analysis in high-performance embedded systems

    Get PDF
    Emerging technologies provide SoCs with fine-grained DVFS capabilities both in space (number of domains) and time (transients in the order of tens of nanoseconds). Analyzing these systems requires cycle-accurate accounting of rapidly-changing dynamics and complex interactions among accelerators, interconnect, memory, and OS. We present an FPGA-based infrastructure that facilitates such analyses for high-performance embedded systems. We show how our infrastructure can be used to first generate SoCs with loosely-coupled accelerators, and then perform design-space exploration considering several DVFS policies under full-system workload scenarios, sweeping spatial and temporal domain granularity

    Timing-Error Tolerance Techniques for Low-Power DSP: Filters and Transforms

    Get PDF
    Low-power Digital Signal Processing (DSP) circuits are critical to commercial System-on-Chip design for battery powered devices. Dynamic Voltage Scaling (DVS) of digital circuits can reclaim worst-case supply voltage margins for delay variation, reducing power consumption. However, removing static margins without compromising robustness is tremendously challenging, especially in an era of escalating reliability concerns due to continued process scaling. The Razor DVS scheme addresses these concerns, by ensuring robustness using explicit timing-error detection and correction circuits. Nonetheless, the design of low-complexity and low-power error correction is often challenging. In this thesis, the Razor framework is applied to fixed-precision DSP filters and transforms. The inherent error tolerance of many DSP algorithms is exploited to achieve very low-overhead error correction. Novel error correction schemes for DSP datapaths are proposed, with very low-overhead circuit realisations. Two new approximate error correction approaches are proposed. The first is based on an adapted sum-of-products form that prevents errors in intermediate results reaching the output, while the second approach forces errors to occur only in less significant bits of each result by shaping the critical path distribution. A third approach is described that achieves exact error correction using time borrowing techniques on critical paths. Unlike previously published approaches, all three proposed are suitable for high clock frequency implementations, as demonstrated with fully placed and routed FIR, FFT and DCT implementations in 90nm and 32nm CMOS. Design issues and theoretical modelling are presented for each approach, along with SPICE simulation results demonstrating power savings of 21 – 29%. Finally, the design of a baseband transmitter in 32nm CMOS for the Spectrally Efficient FDM (SEFDM) system is presented. SEFDM systems offer bandwidth savings compared to Orthogonal FDM (OFDM), at the cost of increased complexity and power consumption, which is quantified with the first VLSI architecture

    DVFS in loop accelerators using BLADES

    No full text
    Hardware accelerators are common in embedded systems that have high performance requirements but must still operate within stringent energy constraints. To facilitate short time-to-market and reduced non-recurring engineering costs, automatic systems that can rapidly generate hardware bearing both power and performance in mind are extremely attractive. This paper proposes the BLADES (Better-than-worst-case Loop Accelerator Design) system for automatically designing self-tuning hardware accelerators that dynamically select their best operating frequency and voltage based on environmental conditions, silicon variation, and input data characteristics. Errors in operation are detected by Razor flip-flops, and recovery is initiated. The architecture efficiently supports detection, rollback, and recovery to provide a highly adaptable and configurable loop accelerator. The overhead of deploying Razor flip-flops is significantly reduced by automatically chaining primitive computation operations together. Results on a range of loop accelerators show average energy savings of 32% gained by voltage scaling below the nominal supply voltage
    corecore