241 research outputs found

    Formal Reliability Analysis Using Theorem Proving

    Get PDF
    Reliability analysis has become a tool of fundamental importance to virtually all electrical and computer engineers because of the extensive usage of hardware systems in safety and mission critical domains, such as medicine, military, and transportation. Due to the strong relationship between reliability theory and probabilistic notions, computer simulation techniques have been traditionally used to perform reliability analysis. However, simulation provides less accurate results and cannot handle large-scale systems due to its enormous CPU time requirements. To ensure accurate and complete reliability analysis and thus more reliable hardware designs, we propose to conduct a formal reliability analysis of systems within the sound core of a higher order logic theorem prover (HOL). In this paper, we present the higher order logic formalization of some fundamental reliability theory concepts, which can be built upon to precisely analyze the reliability of various engineering systems. The proposed approach and formalization is then utilized to analyze the repairability conditions for a reconfigurable memory array in the presence of stuck-at and coupling faults

    Formal Reliability Analysis Using Theorem Proving

    Full text link

    Null Convention Logic applications of asynchronous design in nanotechnology and cryptographic security

    Get PDF
    This dissertation presents two Null Convention Logic (NCL) applications of asynchronous logic circuit design in nanotechnology and cryptographic security. The first application is the Asynchronous Nanowire Reconfigurable Crossbar Architecture (ANRCA); the second one is an asynchronous S-Box design for cryptographic system against Side-Channel Attacks (SCA). The following are the contributions of the first application: 1) Proposed a diode- and resistor-based ANRCA (DR-ANRCA). Three configurable logic block (CLB) structures were designed to efficiently reconfigure a given DR-PGMB as one of the 27 arbitrary NCL threshold gates. A hierarchical architecture was also proposed to implement the higher level logic that requires a large number of DR-PGMBs, such as multiple-bit NCL registers. 2) Proposed a memristor look-up-table based ANRCA (MLUT-ANRCA). An equivalent circuit simulation model has been presented in VHDL and simulated in Quartus II. Meanwhile, the comparison between these two ANRCAs have been analyzed numerically. 3) Presented the defect-tolerance and repair strategies for both DR-ANRCA and MLUT-ANRCA. The following are the contributions of the second application: 1) Designed an NCL based S-Box for Advanced Encryption Standard (AES). Functional verification has been done using Modelsim and Field-Programmable Gate Array (FPGA). 2) Implemented two different power analysis attacks on both NCL S-Box and conventional synchronous S-Box. 3) Developed a novel approach based on stochastic logics to enhance the resistance against DPA and CPA attacks. The functionality of the proposed design has been verified using an 8-bit AES S-box design. The effects of decision weight, bitstream length, and input repetition times on error rates have been also studied. Experimental results shows that the proposed approach enhances the resistance to against the CPA attack by successfully protecting the hidden key --Abstract, page iii

    A survey of fault-tolerance algorithms for reconfigurable nano-crossbar arrays

    Get PDF
    ACM Comput. Surv. Volume 50, issue 6 (November 2017)Nano-crossbar arrays have emerged as a promising and viable technology to improve computing performance of electronic circuits beyond the limits of current CMOS. Arrays offer both structural efficiency with reconfiguration and prospective capability of integration with different technologies. However, certain problems need to be addressed, and the most important one is the prevailing occurrence of faults. Considering fault rate projections as high as 20% that is much higher than those of CMOS, it is fair to expect sophisticated fault-tolerance methods. The focus of this survey article is the assessment and evaluation of these methods and related algorithms applied in logic mapping and configuration processes. As a start, we concisely explain reconfigurable nano-crossbar arrays with their fault characteristics and models. Following that, we demonstrate configuration techniques of the arrays in the presence of permanent faults and elaborate on two main fault-tolerance methodologies, namely defect-unaware and defect-aware approaches, with a short review on advantages and disadvantages. For both methodologies, we present detailed experimental results of related algorithms regarding their strengths and weaknesses with a comprehensive yield, success rate and runtime analysis. Next, we overview fault-tolerance approaches for transient faults. As a conclusion, we overview the proposed algorithms with future directions and upcoming challenges.This work is supported by the EU-H2020-RISE project NANOxCOMP no 691178 and the TUBITAK-Career project no 113E760

    Analysis and Test of the Effects of Single Event Upsets Affecting the Configuration Memory of SRAM-based FPGAs

    Get PDF
    SRAM-based FPGAs are increasingly relevant in a growing number of safety-critical application fields, ranging from automotive to aerospace. These application fields are characterized by a harsh radiation environment that can cause the occurrence of Single Event Upsets (SEUs) in digital devices. These faults have particularly adverse effects on SRAM-based FPGA systems because not only can they temporarily affect the behaviour of the system by changing the contents of flip-flops or memories, but they can also permanently change the functionality implemented by the system itself, by changing the content of the configuration memory. Designing safety-critical applications requires accurate methodologies to evaluate the system’s sensitivity to SEUs as early as possible during the design process. Moreover it is necessary to detect the occurrence of SEUs during the system life-time. To this purpose test patterns should be generated during the design process, and then applied to the inputs of the system during its operation. In this thesis we propose a set of software tools that could be used by designers of SRAM-based FPGA safety-critical applications to assess the sensitivity to SEUs of the system and to generate test patterns for in-service testing. The main feature of these tools is that they implement a model of SEUs affecting the configuration bits controlling the logic and routing resources of an FPGA device that has been demonstrated to be much more accurate than the classical stuck-at and open/short models, that are commonly used in the analysis of faults in digital devices. By keeping this accurate fault model into account, the proposed tools are more accurate than similar academic and commercial tools today available for the analysis of faults in digital circuits, that do not take into account the features of the FPGA technology.. In particular three tools have been designed and developed: (i) ASSESS: Accurate Simulator of SEuS affecting the configuration memory of SRAM-based FPGAs, a simulator of SEUs affecting the configuration memory of an SRAM-based FPGA system for the early assessment of the sensitivity to SEUs; (ii) UA2TPG: Untestability Analyzer and Automatic Test Pattern Generator for SEUs Affecting the Configuration Memory of SRAM-based FPGAs, a static analysis tool for the identification of the untestable SEUs and for the automatic generation of test patterns for in-service testing of the 100% of the testable SEUs; and (iii) GABES: Genetic Algorithm Based Environment for SEU Testing in SRAM-FPGAs, a Genetic Algorithm-based Environment for the generation of an optimized set of test patterns for in-service testing of SEUs. The proposed tools have been applied to some circuits from the ITC’99 benchmark. The results obtained from these experiments have been compared with results obtained by similar experiments in which we considered the stuck-at fault model, instead of the more accurate model for SEUs. From the comparison of these experiments we have been able to verify that the proposed software tools are actually more accurate than similar tools today available. In particular the comparison between results obtained using ASSESS with those obtained by fault injection has shown that the proposed fault simulator has an average error of 0:1% and a maximum error of 0:5%, while using a stuck-at fault simulator the average error with respect of the fault injection experiment has been 15:1% with a maximum error of 56:2%. Similarly the comparison between the results obtained using UA2TPG for the accurate SEU model, with the results obtained for stuck-at faults has shown an average difference of untestability of 7:9% with a maximum of 37:4%. Finally the comparison between fault coverages obtained by test patterns generated for the accurate model of SEUs and the fault coverages obtained by test pattern designed for stuck-at faults, shows that the former detect the 100% of the testable faults, while the latter reach an average fault coverage of 78:9%, with a minimum of 54% and a maximum of 93:16%

    Single event upset hardened embedded domain specific reconfigurable architecture

    Get PDF

    Design Disjunction for Resilient Reconfigurable Hardware

    Get PDF
    Contemporary reconfigurable hardware devices have the capability to achieve high performance, power efficiency, and adaptability required to meet a wide range of design goals. With scaling challenges facing current complementary metal oxide semiconductor (CMOS), new concepts and methodologies supporting efficient adaptation to handle reliability issues are becoming increasingly prominent. Reconfigurable hardware and their ability to realize self-organization features are expected to play a key role in designing future dependable hardware architectures. However, the exponential increase in density and complexity of current commercial SRAM-based field-programmable gate arrays (FPGAs) has escalated the overhead associated with dynamic runtime design adaptation. Traditionally, static modular redundancy techniques are considered to surmount this limitation; however, they can incur substantial overheads in both area and power requirements. To achieve a better trade-off among performance, area, power, and reliability, this research proposes design-time approaches that enable fine selection of redundancy level based on target reliability goals and autonomous adaptation to runtime demands. To achieve this goal, three studies were conducted: First, a graph and set theoretic approach, named Hypergraph-Cover Diversity (HCD), is introduced as a preemptive design technique to shift the dominant costs of resiliency to design-time. In particular, union-free hypergraphs are exploited to partition the reconfigurable resources pool into highly separable subsets of resources, each of which can be utilized by the same synthesized application netlist. The diverse implementations provide reconfiguration-based resilience throughout the system lifetime while avoiding the significant overheads associated with runtime placement and routing phases. Evaluation on a Motion-JPEG image compression core using a Xilinx 7-series-based FPGA hardware platform has demonstrated the potential of the proposed FT method to achieve 37.5% area saving and up to 66% reduction in power consumption compared to the frequently-used TMR scheme while providing superior fault tolerance. Second, Design Disjunction based on non-adaptive group testing is developed to realize a low-overhead fault tolerant system capable of handling self-testing and self-recovery using runtime partial reconfiguration. Reconfiguration is guided by resource grouping procedures which employ non-linear measurements given by the constructive property of f-disjunctness to extend runtime resilience to a large fault space and realize a favorable range of tradeoffs. Disjunct designs are created using the mosaic convergence algorithm developed such that at least one configuration in the library evades any occurrence of up to d resource faults, where d is lower-bounded by f. Experimental results for a set of MCNC and ISCAS benchmarks have demonstrated f-diagnosability at the individual slice level with average isolation resolution of 96.4% (94.4%) for f=1 (f=2) while incurring an average critical path delay impact of only 1.49% and area cost roughly comparable to conventional 2-MR approaches. Finally, the proposed Design Disjunction method is evaluated as a design-time method to improve timing yield in the presence of large random within-die (WID) process variations for application with a moderately high production capacity

    Model-checking tool support for quantitative risk analysis and design for safety

    Get PDF
    This paper is concerned with quantitative analysis of tolerance of sensor hardware failures by control system software. The aim is to help the system designer evaluate the efectiveness of risk reduction measures in the system design. This paper proposes an approach for using stochastic model checking to evaluate how likely a given sensor failure mode is to lead to a hazardous system failure, taking control logic and sensor-update timing failures into account. In particular we propose two complementary techniques: one for examining short- term consequences of component failures and the other for examining more subtle longer-term consequences (so-called hidden failures). The techniques overcome scaling issues and yield valuable insights into the relative merits of dierent design decisions. The PRISM model checker is used for stochastic analysis of Continuous Time Markov Chain (CTMC) system models. The approach is illustrated on a case study from manufacturing, involving an industrial metal Press. Although relatively simple, the Press exhibits a wide range of different behaviours, including hidden failures and subtle race conditions

    Toward Biologically-Inspired Self-Healing, Resilient Architectures for Digital Instrumentation and Control Systems and Embedded Devices

    Get PDF
    Digital Instrumentation and Control (I&C) systems in safety-related applications of next generation industrial automation systems require high levels of resilience against different fault classes. One of the more essential concepts for achieving this goal is the notion of resilient and survivable digital I&C systems. In recent years, self-healing concepts based on biological physiology have received attention for the design of robust digital systems. However, many of these approaches have not been architected from the outset with safety in mind, nor have they been targeted for the automation community where a significant need exists. This dissertation presents a new self-healing digital I&C architecture called BioSymPLe, inspired from the way nature responds, defends and heals: the stem cells in the immune system of living organisms, the life cycle of the living cell, and the pathway from Deoxyribonucleic acid (DNA) to protein. The BioSymPLe architecture is integrating biological concepts, fault tolerance techniques, and operational schematics for the international standard IEC 61131-3 to facilitate adoption in the automation industry. BioSymPLe is organized into three hierarchical levels: the local function migration layer from the top side, the critical service layer in the middle, and the global function migration layer from the bottom side. The local layer is used to monitor the correct execution of functions at the cellular level and to activate healing mechanisms at the critical service level. The critical layer is allocating a group of functional B cells which represent the building block that executes the intended functionality of critical application based on the expression for DNA genetic codes stored inside each cell. The global layer uses a concept of embryonic stem cells by differentiating these type of cells to repair the faulty T cells and supervising all repair mechanisms. Finally, two industrial applications have been mapped on the proposed architecture, which are capable of tolerating a significant number of faults (transient, permanent, and hardware common cause failures CCFs) that can stem from environmental disturbances and we believe the nexus of its concepts can positively impact the next generation of critical systems in the automation industry
    corecore