
    Cross-Layer Optimization for Power-Efficient and Robust Digital Circuits and Systems

    With the increasing demand for digital services, performance and power efficiency have become vital requirements for digital circuits and systems. However, the enabling CMOS technology scaling faces significant challenges from device uncertainties such as process, voltage, and temperature variations. To ensure system reliability, worst-case corner assumptions are usually made at each design level. However, the over-pessimistic worst-case margin leads to unnecessary power waste and performance loss as high as 2.2x. Since optimizations are traditionally confined to each specific level, those safety margins can hardly be exploited properly. To tackle this challenge, this Ph.D. thesis advises performing a cross-layer optimization for digital signal processing circuits and systems to achieve a global balance of power consumption and output quality. To conclude, the traditional over-pessimistic worst-case approach leads to huge power waste. In contrast, the adaptive voltage scaling approach saves power (25% for the CORDIC application) by providing a just-needed supply voltage. The power saving is maximized (46% for CORDIC) when a more aggressive voltage over-scaling scheme is applied. The sparsely occurring circuit errors produced by aggressive voltage over-scaling are mitigated by higher-level error-resilient designs. For functions like FFT and CORDIC, smart error mitigation schemes were proposed to enhance reliability (against soft errors and timing errors, respectively). Applications like Massive MIMO systems are robust against lower-level errors, thanks to the intrinsically redundant antennas. This property makes it feasible to embrace digital hardware that trades quality for power savings. Comment: 190 pages

    Dependable Embedded Systems

    This Open Access book introduces readers to many new techniques for enhancing and optimizing reliability in embedded systems, which have emerged particularly within the last five years. This book introduces the most prominent reliability concerns from today’s point of view and roughly recapitulates the progress in the community so far. Unlike other books that focus on a single abstraction level, such as the circuit level or the system level alone, this book deals with the different reliability challenges across different levels, starting from the physical level all the way to the system level (cross-layer approaches). The book aims to demonstrate how new hardware/software co-design solutions can be proposed to effectively mitigate reliability degradation such as transistor aging, processor variation, temperature effects, soft errors, etc. Provides readers with the latest insights into novel, cross-layer methods and models with respect to dependability of embedded systems; Describes cross-layer approaches that can leverage reliability through techniques that are proactively designed with respect to techniques at other layers; Explains run-time adaptation and concepts/means of self-organization, in order to achieve error resiliency in complex, future many-core systems.

    A 3-D LUT Design for Transient Error Detection Via Inter-Tier In-Silicon Radiation Sensor

    Three-dimensional Integrated Circuits (3-D ICs) have gained much attention as a promising approach to increase IC performance due to their advantages in terms of integration density, power dissipation, and achievable clock frequencies. However, making 3-D ICs resilient to soft errors resulting from radiation effects is a challenging problem. Traditional Radiation-Hardened-by-Design (RHBD) techniques are costly in terms of area, power, and performance overheads. In this work, we propose a new 3-D LUT design integrating error detection capabilities. The LUT has been designed on a two-tier IC model, improving radiation resiliency by selective upsizing of sensitive transistors. In addition, an in-silicon radiation sensor based on an inverter chain has been implemented within the free volume of the 3-D structure. The proposed design shows a 37% reduction in sensitivity to SETs and an effective error detection rate of 83% without introducing any area overhead.

    Design and Analysis of Soft-Error Resilience Mechanisms for GPU Register File

    Modern graphics processing units (GPUs) use an increasingly large register file (RF), which occupies a large fraction of the GPU core area and is very frequently accessed. This makes the RF vulnerable to soft errors (SE). In this paper, we present two techniques for improving the SE resilience of the GPU RF. First, we propose compressing the RF values to reduce the number of vulnerable bits. We leverage value similarity and the presence of narrow-width values to perform compression at warp or thread level, respectively. Second, we propose selective hardening, which designs a portion of each register entry with SE-immune circuits. By using these techniques together, higher resilience can be provided with lower overhead. Without hardening, our warp-level and thread-level compression techniques bring 47.0% and 40.8% reduction in SE vulnerability, respectively.
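
The idea of reducing vulnerable bits through compression can be illustrated with a minimal Python sketch. It is not the paper's scheme: the function names, the 32-bit register width, and the base-plus-delta encoding used for the warp-level case are all assumptions chosen for illustration. The thread-level function counts how many bits a narrow-width value actually needs, and the warp-level function counts the bits needed when similar values are stored as one base plus small deltas.

```python
def significant_bits(value: int, width: int = 32) -> int:
    """Bits needed to hold a `width`-bit register value in two's complement.

    Hypothetical helper: a narrow-width value needs far fewer than `width` bits.
    """
    # Interpret the raw register contents as a signed integer.
    value &= (1 << width) - 1
    if value >= 1 << (width - 1):
        value -= 1 << width
    # Magnitude bits plus one sign bit; ~v maps negative v to a non-negative count.
    magnitude = value if value >= 0 else ~value
    return magnitude.bit_length() + 1


def warp_compressed_bits(values: list[int], width: int = 32) -> int:
    """Bits to store a warp's values as one full base plus fixed-width deltas.

    Assumed base-delta encoding; it pays off only when values are similar.
    """
    base = values[0]
    deltas = [v - base for v in values[1:]]
    delta_width = max((significant_bits(d, width) for d in deltas), default=0)
    return width + delta_width * len(deltas)


if __name__ == "__main__":
    warp = [1000, 1001, 1003, 998]
    print("uncompressed bits:", 32 * len(warp))
    print("compressed bits:  ", warp_compressed_bits(warp))
```

Fewer stored bits means fewer bits exposed to particle strikes, which is the intuition behind the vulnerability reduction reported above; the actual percentages depend on the workload and on the paper's own encoding.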

    An Experimental Study of Reduced-Voltage Operation in Modern FPGAs for Neural Network Acceleration

    We empirically evaluate an undervolting technique, i.e., underscaling the circuit supply voltage below the nominal level, to improve the power-efficiency of Convolutional Neural Network (CNN) accelerators mapped to Field Programmable Gate Arrays (FPGAs). Undervolting below a safe voltage level can lead to timing faults due to excessive circuit latency increase. We evaluate the reliability-power trade-off for such accelerators. Specifically, we experimentally study the reduced-voltage operation of multiple components of real FPGAs, characterize the corresponding reliability behavior of CNN accelerators, propose techniques to minimize the drawbacks of reduced-voltage operation, and combine undervolting with architectural CNN optimization techniques, i.e., quantization and pruning. We investigate the effect of environmental temperature on the reliability-power trade-off of such accelerators. We perform experiments on three identical samples of modern Xilinx ZCU102 FPGA platforms with five state-of-the-art image classification CNN benchmarks. This approach allows us to study the effects of our undervolting technique for both software and hardware variability. We achieve more than 3X power-efficiency (GOPs/W) gain via undervolting. 2.6X of this gain is the result of eliminating the voltage guardband region, i.e., the safe voltage region below the nominal level that is set by the FPGA vendor to ensure correct functionality in worst-case environmental and circuit conditions. 43% of the power-efficiency gain is due to further undervolting below the guardband, which comes at the cost of accuracy loss in the CNN accelerator. We evaluate an effective frequency underscaling technique that prevents this accuracy loss, and find that it reduces the power-efficiency gain from 43% to 25%. Comment: To appear at the DSN 2020 conference

    Analysis of Single Event Transient Effects in Standard Delay Cells Based on Decoupling Capacitors

    Single Event Transients (SETs), i.e., voltage glitches induced in combinational logic as a result of the passage of energetic particles, represent an increasingly critical reliability threat for modern complementary metal oxide semiconductor (CMOS) integrated circuits (ICs) employed in space missions. In rad-hard ICs implemented with standard digital cells, special design techniques should be applied to reduce the Soft Error Rate (SER) due to SETs. To this end, it is essential to consider the SET robustness of individual standard cells. Among the wide range of logic cells available in standard cell libraries, the standard delay cells (SDCs) implemented with skew-sized inverters are exceptionally vulnerable to SETs. Namely, the SET pulses induced in these cells may be hundreds of picoseconds longer than those in other standard cells. In this work, an alternative design of an SDC based on two inverters and two decoupling capacitors is introduced. Electrical simulations have shown that the propagation delay and SET robustness of the proposed delay cell are strongly influenced by the transistor sizes and supply voltage, while the impact of temperature is moderate. The proposed design is more tolerant to SETs than the SDCs with skew-sized inverters, and occupies less area than the hardening configurations based on partial and complete duplication. Due to the low transistor count (only six transistors), the proposed delay cell could also be used as an SET filter.

    Review of Fault Mitigation Approaches for Deep Neural Networks for Computer Vision in Autonomous Driving

    The aim of this work is to identify and present challenges and risks related to the employment of DNNs in Computer Vision for Autonomous Driving. Nowadays, one of the major technological challenges is choosing the right technology from the abundance available on the market. Specifically, this thesis collects a synopsis of the state-of-the-art architectures, techniques, and methodologies adopted for building fault-tolerant hardware and ensuring robustness in DNN-based Computer Vision applications for Autonomous Driving.

    X-Rel: Energy-Efficient and Low-Overhead Approximate Reliability Framework for Error-Tolerant Applications Deployed in Critical Systems

    Triple Modular Redundancy (TMR) is one of the most common techniques in fault-tolerant systems, in which the output is determined by a majority voter. However, the design diversity of replicated modules and/or soft errors, which are more likely to happen in the nanoscale era, may affect the majority voting scheme. Besides, the significant overheads of the TMR scheme may limit its usage in energy- and area-constrained critical systems. However, for most inherently error-resilient applications such as image processing and vision deployed in critical systems (like autonomous vehicles and robotics), achieving a given level of reliability has higher priority than precise results. Therefore, these applications can benefit from the approximate computing paradigm to achieve higher energy efficiency and a lower area. This paper proposes an energy-efficient approximate reliability (X-Rel) framework to overcome the aforementioned challenges of TMR systems and exploit the full potential of approximate computing without sacrificing the desired reliability constraint and output quality. The X-Rel framework relies on relaxing the precision of the voter based on a systematic error bounding method that leverages user-defined quality and reliability constraints. Afterward, the size of the resulting voter is used to approximate the TMR modules such that the overall area and energy consumption are minimized. The effectiveness of employing the proposed X-Rel technique in a TMR structure, for different quality constraints as well as with various reliability bounds, is evaluated in a 15-nm FinFET technology. The results of the X-Rel voter show delay, area, and energy consumption reductions of up to 86%, 87%, and 98%, respectively, when compared to those of the state-of-the-art approximate TMR voters. Comment: This paper has been published in IEEE Transactions on Very Large Scale Integration (VLSI) Systems
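
For readers unfamiliar with majority voting, the following Python sketch shows an exact bitwise TMR voter next to a relaxed-precision variant. The relaxed voter is a hypothetical illustration of the general idea of voting on fewer bits; it is not the X-Rel design, and the function names and the `ignored_lsbs` parameter are assumptions introduced here.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Exact bitwise majority of three module outputs (non-negative ints)."""
    # Each output bit is 1 when at least two of the three inputs agree on 1.
    return (a & b) | (b & c) | (a & c)


def approximate_vote(a: int, b: int, c: int, ignored_lsbs: int) -> int:
    """Vote only on the upper bits and pass module a's low bits through unvoted.

    Hypothetical relaxed-precision voter: a fault confined to the unvoted
    low bits changes the result by less than 2**ignored_lsbs.
    """
    low_mask = (1 << ignored_lsbs) - 1
    voted_high = majority_vote(a, b, c) & ~low_mask
    return voted_high | (a & low_mask)


if __name__ == "__main__":
    # Module c suffers an upset in its most significant bit.
    a, b, c = 0b10110101, 0b10110110, 0b00110111
    exact = majority_vote(a, b, c)
    relaxed = approximate_vote(a, b, c, ignored_lsbs=2)
    print(f"exact:   {exact:08b}")
    print(f"relaxed: {relaxed:08b}")
    print("absolute difference:", abs(exact - relaxed))
```

In hardware, a voter that skips the low-order bits needs fewer gates, which is the kind of area and energy saving the abstract refers to; how many bits can be skipped for a given quality and reliability target is what the paper's error bounding method determines.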

    Radiation Tolerant Electronics, Volume II

    Research on radiation tolerant electronics has increased rapidly over the last few years, resulting in many interesting approaches to model radiation effects and design radiation hardened integrated circuits and embedded systems. This research is strongly driven by the growing need for radiation hardened electronics for space applications, high-energy physics experiments such as those on the Large Hadron Collider at CERN, and many terrestrial nuclear applications, including nuclear energy and safety management. With the progressive scaling of integrated circuit technologies and the growing complexity of electronic systems, their ionizing radiation susceptibility has raised many exciting challenges, which are expected to drive research in the coming decade. After the success of the first Special Issue on Radiation Tolerant Electronics, the current Special Issue features thirteen articles highlighting recent breakthroughs in radiation tolerant integrated circuit design, fault tolerance in FPGAs, radiation effects in semiconductor materials and advanced IC technologies, and modelling of radiation effects.

    Fault Tolerant Electronic System Design

    Due to technology scaling, which means reduced transistor size, higher density, lower voltage and more aggressive clock frequency, VLSI devices may become more sensitive to soft errors. Especially for devices used in safety- and mission-critical applications, dependability and reliability are becoming increasingly important constraints during the development of systems built on or around them. Other phenomena (e.g., aging and wear-out effects) also have negative impacts on the reliability of modern circuits. Recent research shows that even at sea level, radiation particles can still induce soft errors in electronic systems. On the one hand, processor-based systems are commonly used in a wide variety of applications, including safety-critical and high-availability missions, e.g., in the automotive, biomedical and aerospace domains. In these fields, an error may produce catastrophic consequences. Thus, dependability is a primary target that must be achieved while taking into account tight constraints in terms of cost, performance, power and time to market. With standards and regulations (e.g., ISO-26262, DO-254, IEC-61508) clearly specifying the targets to be achieved and the methods to prove their achievement, techniques working at system level are particularly attractive. On the other hand, Field Programmable Gate Array (FPGA) devices are becoming more and more attractive, also in safety- and mission-critical applications, due to the high performance, low power consumption and flexibility for reconfiguration they provide. Two types of FPGAs are commonly used, distinguished by their configuration memory cell technology, i.e., SRAM-based and Flash-based FPGAs. For SRAM-based FPGAs, the SRAM cells of the configuration memory are highly susceptible to radiation-induced effects, which can lead to system failure. For Flash-based FPGAs, even though their non-volatile configuration memory cells are almost immune to Single Event Upsets induced by energetic particles, the floating gate switches and the logic cells in the configuration tiles can still suffer from Single Event Effects when hit by a highly charged particle. Analysis and mitigation techniques for Single Event Effects on FPGAs are therefore becoming increasingly important in the design flow, especially when reliability is one of the main requirements.