4 research outputs found

    Soft-Error Resilience Framework For Reliable and Energy-Efficient CMOS Logic and Spintronic Memory Architectures

    The revolution in chip manufacturing processes spanning five decades has proliferated high-performance and energy-efficient nano-electronic devices across all aspects of daily life. In recent years, CMOS technology scaling has realized billions of transistors within large-scale VLSI chips to elevate performance. However, these advancements have also continually augmented the impact of Single-Event Transient (SET) and Single-Event Upset (SEU) occurrences, which precipitate a range of Soft-Error (SE) dependability issues. Consequently, soft-error mitigation techniques have become essential to improve systems' reliability. Herein, first, we proposed optimized soft-error resilience designs to improve the robustness of sub-micron computing systems. The proposed approaches were developed to deliver energy efficiency and tolerate double/multiple errors simultaneously while incurring acceptable speed degradation compared to prior work. Secondly, the impact of Process Variation (PV) in the Near-Threshold Voltage (NTV) region on redundancy-based SE-mitigation approaches for High-Performance Computing (HPC) systems was investigated to highlight the approach that can realize favorable attributes, such as reduced critical datapath delay variation and low speed degradation. Finally, spin-based devices have recently been widely used to design Non-Volatile (NV) elements such as NV latches and flip-flops, which can be leveraged in normally-off computing architectures for Internet-of-Things (IoT) and energy-harvesting-powered applications. Thus, in the last portion of this dissertation, we design and evaluate soft-error-resilient NV-latching circuits that can achieve intriguing features, such as low energy consumption, high computing performance, and superior soft-error tolerance, i.e., the ability to concurrently tolerate Multiple Node Upset (MNU), to potentially become a mainstream solution for aerospace and avionic nanoelectronics.
Together, these objectives cooperate to increase the energy efficiency and soft-error mitigation resilience of larger-scale emerging NV latching circuits within iso-energy constraints. In summary, addressing these reliability concerns is paramount to the successful deployment of future reliable and energy-efficient CMOS logic and spintronic memory architectures with deeply-scaled devices operating at low voltages.
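
The abstract refers to redundancy-based soft-error mitigation without detailing the circuits. As a generic illustration only, and not the dissertation's proposed designs, the following minimal sketch models triple modular redundancy (TMR) with a bitwise majority voter. The names `majority` and `flip_bit` and the example word are assumptions introduced here.

```python
def majority(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority vote over three redundant copies of a word."""
    return (a & b) | (a & c) | (b & c)


def flip_bit(word: int, position: int) -> int:
    """Model a single-event upset by inverting one bit of a stored word."""
    return word ^ (1 << position)


# Hypothetical 8-bit value held in three redundant storage elements.
golden = 0b10110010

# One upset copy: the two intact copies outvote it.
single_upset = majority(golden, flip_bit(golden, 3), golden)

# Two copies upset at the same bit position: the voter is fooled.
double_upset = majority(flip_bit(golden, 3), flip_bit(golden, 3), golden)

print(single_upset == golden)  # the vote recovers the stored value
print(double_upset == golden)  # the vote returns the corrupted value
```

Plain TMR masks an upset in one copy per bit position but fails when two copies are hit at the same bit, which is one reason designs that tolerate double or multiple node upsets need additional structure.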

    Heterogeneous Reconfigurable Fabrics for In-circuit Training and Evaluation of Neuromorphic Architectures

    A heterogeneous device technology reconfigurable logic fabric is proposed which leverages the cooperating advantages of distinct magnetic random access memory (MRAM)-based look-up tables (LUTs) to realize sequential logic circuits, along with conventional SRAM-based LUTs to realize combinational logic paths. The resulting Hybrid Spin/Charge FPGA (HSC-FPGA) using magnetic tunnel junction (MTJ) devices within this topology demonstrates commensurate reductions in area and power consumption over fabrics having LUTs constructed with either individual technology alone. Herein, a hierarchical top-down design approach is used to develop the HSC-FPGA, starting from the configurable logic block (CLB) and slice structures down to LUT circuits and the corresponding device fabrication paradigms. This facilitates a novel architectural approach to reduce leakage energy, minimize communication occurrence and energy cost by eliminating unnecessary data transfer, and support auto-tuning for resilience. Furthermore, HSC-FPGA enables new advantages of technology co-design, which trades off alternative mappings between emerging devices and transistors at runtime by allowing dynamic remapping to adaptively leverage the intrinsic computing features of each device technology. HSC-FPGA offers a platform for fine-grained Logic-In-Memory architectures and runtime adaptive hardware. An orthogonal dimension of fabric heterogeneity is non-determinism, enabled by either low-voltage CMOS or probabilistic emerging devices. It can be realized using probabilistic devices within a reconfigurable network to blend deterministic and probabilistic computational models. Herein, we consider the probabilistic spin logic p-bit device as a fabric element comprising a crossbar-structured weighted array. The programmability of the resistive network interconnecting p-bit devices can be achieved by modifying the resistive states of the array's weighted connections.
Thus, the programmable weighted array forms a CLB-scale macro co-processing element with bitstream programmability. This allows field programmability for a wide range of classification problems and recognition tasks, enabling fluid mappings of probabilistic and deterministic computing approaches. In particular, a Deep Belief Network (DBN) is implemented in the field using recurrent layers of co-processing elements to form an n × m1 × m2 × ... × mi weighted array as a configurable hardware circuit with an n-input layer followed by i ≥ 1 hidden layers. As neuromorphic architectures using post-CMOS devices increase in capability and network size, the utility and benefits of reconfigurable fabrics of neuromorphic modules can be anticipated to continue to accelerate.
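
The abstract does not state the p-bit device model. A minimal behavioral sketch follows, assuming the commonly used form m = sgn(tanh(I) + r) with r drawn uniformly from (-1, 1), where I is the weighted sum delivered by the resistive array. The function names `p_bit` and `weighted_layer` and the example weights are hypothetical and are not taken from the thesis.

```python
import math
import random


def p_bit(input_current: float, rng: random.Random) -> int:
    """Return +1 or -1; the chance of +1 grows with tanh(input_current).

    Assumed behavioral model: m = sgn(tanh(I) + r), r uniform on (-1, 1).
    """
    return 1 if math.tanh(input_current) + rng.uniform(-1.0, 1.0) > 0 else -1


def weighted_layer(weights, biases, inputs, rng):
    """One layer of p-bits driven through a programmable weight array.

    weights[j][k] stands in for the programmed conductance from input k
    to p-bit j; reprogramming the fabric corresponds to changing it.
    """
    outputs = []
    for row, bias in zip(weights, biases):
        current = bias + sum(w * x for w, x in zip(row, inputs))
        outputs.append(p_bit(current, rng))
    return outputs


rng = random.Random(0)
# Hypothetical 2-input, 2-output array with strong weights of opposite sign.
print(weighted_layer([[10, 10], [-10, -10]], [0, 0], [1, 1], rng))
```

With a zero input current the p-bit output is an unbiased coin flip, and with a large input it becomes effectively deterministic, which is how a single fabric element can span probabilistic and deterministic operation in this model.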

    Efficient Evaluation of Probability and Reliability with Digital Integrated Circuits

    As complementary metal–oxide–semiconductor (CMOS) devices shrink to the nanoscale, digital integrated circuits (ICs) become more susceptible to various environmental parameters, such as temperature, supply voltage, wiring, noise, and fabrication process variations. This reduces circuit operation reliability (i.e., the probability that a circuit or component performs its intended logic function). Signal probability (the probability that a digital signal produces logic 1) is another factor, one that measures a circuit's dynamic behavior and power dissipation. Research shows that signal probability and reliability within ICs may interact with each other in a complicated way. Generally speaking, as signal probability changes due to input probability variations, so does signal reliability, and vice versa. This motivates simultaneous evaluation of both for digital ICs towards their performance improvement. However, this evaluation can be a challenge, especially for large-scale circuits, due to signal correlations caused by reconvergent fanouts within circuits. Of the two existing evaluation methods, i.e., numerical and analytical methods, the former gives high accuracy at the cost of expensive computation, while the latter does exactly the opposite. This thesis provides a hybrid solution that takes advantage of both numerical and analytical methods to achieve fast and accurate evaluation of signal probability and reliability for ICs (including both combinational and sequential circuits). First, we develop a categorization-based analytical model for combinational circuits to deal with a variety of signal correlations. For strongly correlated or independent cases, analytical solutions are applied for accurate results. For cases with moderate correlation strength, we use local bitstream simulations for fast estimation.
Our simulation results show that the proposed method is hundreds of times faster than Monte-Carlo (MC) simulation, while keeping almost the same level of accuracy. We then extend the above method to sequential circuits (with a finite-state-machine model) for probability and reliability evaluation. Since sequential circuits can be viewed as an unfolded network of combinational logic, our focus is on how both probability and reliability converge to a final stable state over a certain number of cycles/iterations. To improve the efficiency of this convergence process, we propose a two-step-convergence (TSC) model instead of using traditional step-size-based convergence. Simulation results show that the proposed method speeds up the process by around 30% on average compared to the traditional method while maintaining a high level of accuracy. Finally, we study the impact of device aging on circuit reliability. After years of operation, CMOS (especially PMOS) devices experience an increase in their threshold voltage, a phenomenon called Negative Bias Temperature Instability (NBTI). This aging effect leads to increased gate delay and late signal arrival times, making circuits temporally unreliable. Threshold voltage changes may also negatively affect the probability that transistors perform their intended logical operations, making them spatially less reliable. Our investigation focuses on evaluating the overall reliability at the circuit level by considering both its spatial aspect (solely considering the correctness of signal logic values) and its temporal aspect (considering whether signals arrive in time to be sampled). This would help circuit designers predict the circuit lifetime. Simulations on benchmark circuits show that the reliability degradation rate due to the aging effect ranges from 1.5% to 8.2% over a one-year period, depending on the specific circuit.
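
To illustrate why reconvergent fanout complicates signal-probability evaluation, the following minimal sketch compares three estimates on a toy circuit, y = (a AND b) OR (a AND c), in which input a fans out and reconverges. This is not the thesis's categorization-based model or its TSC model. The circuit, the function names, and the sample count are assumptions chosen for illustration.

```python
import itertools
import random


def circuit(a, b, c):
    """Toy circuit with reconvergent fanout: input a feeds both AND gates."""
    g1 = a and b
    g2 = a and c
    return bool(g1 or g2)


def naive_probability(pa, pb, pc):
    """Gate-by-gate propagation that wrongly treats g1 and g2 as independent."""
    p1 = pa * pb
    p2 = pa * pc
    return 1 - (1 - p1) * (1 - p2)


def exact_probability(pa, pb, pc):
    """Exact signal probability by weighting every input combination."""
    total = 0.0
    for a, b, c in itertools.product([0, 1], repeat=3):
        weight = ((pa if a else 1 - pa)
                  * (pb if b else 1 - pb)
                  * (pc if c else 1 - pc))
        if circuit(a, b, c):
            total += weight
    return total


def monte_carlo_probability(pa, pb, pc, samples=100_000, seed=0):
    """Numerical estimate: draw random input vectors and count logic-1 outputs."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        a = rng.random() < pa
        b = rng.random() < pb
        c = rng.random() < pc
        hits += circuit(a, b, c)
    return hits / samples


print(naive_probability(0.5, 0.5, 0.5))        # 0.4375, biased by the correlation
print(exact_probability(0.5, 0.5, 0.5))        # 0.375
print(monte_carlo_probability(0.5, 0.5, 0.5))  # close to 0.375, at higher cost
```

Exhaustive enumeration grows exponentially with the number of inputs and Monte-Carlo needs many samples for tight error bounds, while the cheap gate-level formula is biased wherever signals are correlated. That trade-off is the gap a hybrid method aims to close.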

    Dependable Embedded Systems

    This Open Access book introduces readers to many new techniques for enhancing and optimizing reliability in embedded systems, which have emerged particularly within the last five years. This book introduces the most prominent reliability concerns from today’s points of view and roughly recapitulates the progress in the community so far. Unlike other books that focus on a single abstraction level, such as the circuit level or the system level alone, this book deals with reliability challenges across different levels, starting from the physical level all the way to the system level (cross-layer approaches). The book aims at demonstrating how new hardware/software co-design solutions can be proposed to effectively mitigate reliability degradation such as transistor aging, processor variation, temperature effects, soft errors, etc. Provides readers with the latest insights into novel, cross-layer methods and models with respect to dependability of embedded systems; Describes cross-layer approaches that can leverage reliability through techniques that are pro-actively designed with respect to techniques at other layers; Explains run-time adaptation and concepts/means of self-organization, in order to achieve error resiliency in complex, future many-core systems.