Search CORE

315 research outputs found

Transient error mitigation by means of approximate logic circuits

Author: Sánchez Clemente Antonio José
Publication venue
Publication date: 01/01/2017
Field of study

Mención Internacional en el título de doctorThe technological advances in the manufacturing of electronic circuits have allowed to greatly improve their performance, but they have also increased the sensitivity of electronic devices to radiation-induced errors. Among them, the most common effects are the SEEs, i.e., electrical perturbations provoked by the strike of high-energy particles, which may modify the internal state of a memory element (SEU) or generate erroneous transient pulses (SET), among other effects. These events pose a threat for the reliability of electronic circuits, and therefore fault-tolerance techniques must be applied to deal with them. The most common fault-tolerance techniques are based in full replication (DWC or TMR). These techniques are able to cover a wide range of failure mechanisms present in electronic circuits. However, they suffer from high overheads in terms of area and power consumption. For this reason, lighter alternatives are often sought at the expense of slightly reducing reliability for the least critical circuit sections. In this context a new paradigm of electronic design is emerging, known as approximate computing, which is based on improving the circuit performance in change of slight modifications of the intended functionality. This is an interesting approach for the design of lightweight fault-tolerant solutions, which has not been yet studied in depth. The main goal of this thesis consists in developing new lightweight fault-tolerant techniques with partial replication, by means of approximate logic circuits. These circuits can be designed with great flexibility. This way, the level of protection as well as the overheads can be adjusted at will depending on the necessities of each application. However, finding optimal approximate circuits for a given application is still a challenge. In this thesis a method for approximate circuit generation is proposed, denoted as fault approximation, which consists in assigning constant logic values to specific circuit lines. On the other hand, several criteria are developed to generate the most suitable approximate circuits for each application, by using this fault approximation mechanism. These criteria are based on the idea of approximating the least testable sections of circuits, which allows reducing overheads while minimising the loss of reliability. Therefore, in this thesis the selection of approximations is linked to testability measures. The first criterion for fault selection developed in this thesis uses static testability measures. The approximations are generated from the results of a fault simulation of the target circuit, and from a user-specified testability threshold. The amount of approximated faults depends on the chosen threshold, which allows to generate approximate circuits with different performances. Although this approach was initially intended for combinational circuits, an extension to sequential circuits has been performed as well, by considering the flip-flops as both inputs and outputs of the combinational part of the circuit. The experimental results show that this technique achieves a wide scalability, and an acceptable trade-off between reliability versus overheads. In addition, its computational complexity is very low. However, the selection criterion based in static testability measures has some drawbacks. Adjusting the performance of the generated approximate circuits by means of the approximation threshold is not intuitive, and the static testability measures do not take into account the changes as long as faults are approximated. Therefore, an alternative criterion is proposed, which is based on dynamic testability measures. With this criterion, the testability of each fault is computed by means of an implication-based probability analysis. The probabilities are updated with each new approximated fault, in such a way that on each iteration the most beneficial approximation is chosen, that is, the fault with the lowest probability. In addition, the computed probabilities allow to estimate the level of protection against faults that the generated approximate circuits provide. Therefore, it is possible to generate circuits which stick to a target error rate. By modifying this target, circuits with different performances can be obtained. The experimental results show that this new approach is able to stick to the target error rate with reasonably good precision. In addition, the approximate circuits generated with this technique show better performance than with the approach based in static testability measures. In addition, the fault implications have been reused too in order to implement a new type of logic transformation, which consists in substituting functionally similar nodes. Once the fault selection criteria have been developed, they are applied to different scenarios. First, an extension of the proposed techniques to FPGAs is performed, taking into account the particularities of this kind of circuits. This approach has been validated by means of radiation experiments, which show that a partial replication with approximate circuits can be even more robust than a full replication approach, because a smaller area reduces the probability of SEE occurrence. Besides, the proposed techniques have been applied to a real application circuit as well, in particular to the microprocessor ARM Cortex M0. A set of software benchmarks is used to generate the required testability measures. Finally, a comparative study of the proposed approaches with approximate circuit generation by means of evolutive techniques have been performed. These approaches make use of a high computational capacity to generate multiple circuits by trial-and-error, thus reducing the possibility of falling into local minima. The experimental results demonstrate that the circuits generated with evolutive approaches are slightly better in performance than the circuits generated with the techniques here proposed, although with a much higher computational effort. In summary, several original fault mitigation techniques with approximate logic circuits are proposed. These approaches are demonstrated in various scenarios, showing that the scalability and adaptability to the requirements of each application are their main virtuesLos avances tecnológicos en la fabricación de circuitos electrónicos han permitido mejorar en gran medida sus prestaciones, pero también han incrementado la sensibilidad de los mismos a los errores provocados por la radiación. Entre ellos, los más comunes son los SEEs, perturbaciones eléctricas causadas por el impacto de partículas de alta energía, que entre otros efectos pueden modificar el estado de los elementos de memoria (SEU) o generar pulsos transitorios de valor erróneo (SET). Estos eventos suponen un riesgo para la fiabilidad de los circuitos electrónicos, por lo que deben ser tratados mediante técnicas de tolerancia a fallos. Las técnicas de tolerancia a fallos más comunes se basan en la replicación completa del circuito (DWC o TMR). Estas técnicas son capaces de cubrir una amplia variedad de modos de fallo presentes en los circuitos electrónicos. Sin embargo, presentan un elevado sobrecoste en área y consumo. Por ello, a menudo se buscan alternativas más ligeras, aunque no tan efectivas, basadas en una replicación parcial. En este contexto surge una nueva filosofía de diseño electrónico, conocida como computación aproximada, basada en mejorar las prestaciones de un diseño a cambio de ligeras modificaciones de la funcionalidad prevista. Es un enfoque atractivo y poco explorado para el diseño de soluciones ligeras de tolerancia a fallos. El objetivo de esta tesis consiste en desarrollar nuevas técnicas ligeras de tolerancia a fallos por replicación parcial, mediante el uso de circuitos lógicos aproximados. Estos circuitos se pueden diseñar con una gran flexibilidad. De este forma, tanto el nivel de protección como el sobrecoste se pueden regular libremente en función de los requisitos de cada aplicación. Sin embargo, encontrar los circuitos aproximados óptimos para cada aplicación es actualmente un reto. En la presente tesis se propone un método para generar circuitos aproximados, denominado aproximación de fallos, consistente en asignar constantes lógicas a ciertas líneas del circuito. Por otro lado, se desarrollan varios criterios de selección para, mediante este mecanismo, generar los circuitos aproximados más adecuados para cada aplicación. Estos criterios se basan en la idea de aproximar las secciones menos testables del circuito, lo que permite reducir los sobrecostes minimizando la perdida de fiabilidad. Por tanto, en esta tesis la selección de aproximaciones se realiza a partir de medidas de testabilidad. El primer criterio de selección de fallos desarrollado en la presente tesis hace uso de medidas de testabilidad estáticas. Las aproximaciones se generan a partir de los resultados de una simulación de fallos del circuito objetivo, y de un umbral de testabilidad especificado por el usuario. La cantidad de fallos aproximados depende del umbral escogido, lo que permite generar circuitos aproximados con diferentes prestaciones. Aunque inicialmente este método ha sido concebido para circuitos combinacionales, también se ha realizado una extensión a circuitos secuenciales, considerando los biestables como entradas y salidas de la parte combinacional del circuito. Los resultados experimentales demuestran que esta técnica consigue una buena escalabilidad, y unas prestaciones de coste frente a fiabilidad aceptables. Además, tiene un coste computacional muy bajo. Sin embargo, el criterio de selección basado en medidas estáticas presenta algunos inconvenientes. No resulta intuitivo ajustar las prestaciones de los circuitos aproximados a partir de un umbral de testabilidad, y las medidas estáticas no tienen en cuenta los cambios producidos a medida que se van aproximando fallos. Por ello, se propone un criterio alternativo de selección de fallos, basado en medidas de testabilidad dinámicas. Con este criterio, la testabilidad de cada fallo se calcula mediante un análisis de probabilidades basado en implicaciones. Las probabilidades se actualizan con cada nuevo fallo aproximado, de forma que en cada iteración se elige la aproximación más favorable, es decir, el fallo con menor probabilidad. Además, las probabilidades calculadas permiten estimar la protección frente a fallos que ofrecen los circuitos aproximados generados, por lo que es posible generar circuitos que se ajusten a una tasa de fallos objetivo. Modificando esta tasa se obtienen circuitos aproximados con diferentes prestaciones. Los resultados experimentales muestran que este método es capaz de ajustarse razonablemente bien a la tasa de fallos objetivo. Además, los circuitos generados con esta técnica muestran mejores prestaciones que con el método basado en medidas estáticas. También se han aprovechado las implicaciones de fallos para implementar un nuevo tipo de transformación lógica, consistente en sustituir nodos funcionalmente similares. Una vez desarrollados los criterios de selección de fallos, se aplican a distintos campos. En primer lugar, se hace una extensión de las técnicas propuestas para FPGAs, teniendo en cuenta las particularidades de este tipo de circuitos. Esta técnica se ha validado mediante experimentos de radiación, los cuales demuestran que una replicación parcial con circuitos aproximados puede ser incluso más robusta que una replicación completa, ya que un área más pequeña reduce la probabilidad de SEEs. Por otro lado, también se han aplicado las técnicas propuestas en esta tesis a un circuito de aplicación real, el microprocesador ARM Cortex M0, utilizando un conjunto de benchmarks software para generar las medidas de testabilidad necesarias. Por ´último, se realiza un estudio comparativo de las técnicas desarrolladas con la generación de circuitos aproximados mediante técnicas evolutivas. Estas técnicas hacen uso de una gran capacidad de cálculo para generar múltiples circuitos mediante ensayo y error, reduciendo la posibilidad de caer en algún mínimo local. Los resultados confirman que, en efecto, los circuitos generados mediante técnicas evolutivas son ligeramente mejores en prestaciones que con las técnicas aquí propuestas, pero con un coste computacional mucho mayor. En definitiva, se proponen varias técnicas originales de mitigación de fallos mediante circuitos aproximados. Se demuestra que estas técnicas tienen diversas aplicaciones, haciendo de la flexibilidad y adaptabilidad a los requisitos de cada aplicación sus principales virtudes.Programa Oficial de Doctorado en Ingeniería Eléctrica, Electrónica y AutomáticaPresidente: Raoul Velazco.- Secretario: Almudena Lindoso Muñoz.- Vocal: Jaume Segura Fuste

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Universidad Carlos III de Madrid e-Archivo

Advanced information processing system: The Army fault tolerant architecture conceptual study. Volume 1: Army fault tolerant architecture overview

Author: Alger L. S.
Babikyan C. A.
Butler B. P.
Friend S. A.
Ganska R. J.
Harper R. E.
Lala J. H.
Masotto T. K.
Meyer A. J.
Morton D. P.
Publication venue
Publication date
Field of study

Digital computing systems needed for Army programs such as the Computer-Aided Low Altitude Helicopter Flight Program and the Armored Systems Modernization (ASM) vehicles may be characterized by high computational throughput and input/output bandwidth, hard real-time response, high reliability and availability, and maintainability, testability, and producibility requirements. In addition, such a system should be affordable to produce, procure, maintain, and upgrade. To address these needs, the Army Fault Tolerant Architecture (AFTA) is being designed and constructed under a three-year program comprised of a conceptual study, detailed design and fabrication, and demonstration and validation phases. Described here are the results of the conceptual study phase of the AFTA development. Given here is an introduction to the AFTA program, its objectives, and key elements of its technical approach. A format is designed for representing mission requirements in a manner suitable for first order AFTA sizing and analysis, followed by a discussion of the current state of mission requirements acquisition for the targeted Army missions. An overview is given of AFTA's architectural theory of operation

NASA Technical Reports Server

Fault-Tolerant Computing: An Overview

Author: Banerjee P.
Fuchs W.K.
Horst R.
Iyer R.K.
Patel J.H.
Publication venue: Center for Reliable and High-Performance Computing, Coordinated Science Laboratory, University of Illinois at Urbana-Champaign
Publication date: 01/06/1991
Field of study

Coordinated Science Laboratory was formerly known as Control Systems LaboratoryNASA / NAG-1-613Semiconductor Research Corporation / 90-DP-109Joint Services Electronics Program / N00014-90-J-127

Illinois Digital Environment for Access to Learning and Scholarship Repository

Error Mitigation Using Approximate Logic Circuits: A Comparison of Probabilistic and Evolutionary Approaches

Author: Entrena Arrontes Luis Alfonso
Hrbacek Radek
Sekanina Lukas
Sánchez Clemente Antonio José
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2016
Field of study

Technology scaling poses an increasing challenge to the reliability of digital circuits. Hardware redundancy solutions, such as triple modular redundancy (TMR), produce very high area overhead, so partial redundancy is often used to reduce the overheads. Approximate logic circuits provide a general framework for optimized mitigation of errors arising from a broad class of failure mechanisms, including transient, intermittent, and permanent failures. However, generating an optimal redundant logic circuit that is able to mask the faults with the highest probability while minimizing the area overheads is a challenging problem. In this study, we propose and compare two new approaches to generate approximate logic circuits to be used in a TMR schema. The probabilistic approach approximates a circuit in a greedy manner based on a probabilistic estimation of the error. The evolutionary approach can provide radically different solutions that are hard to reach by other methods. By combining these two approaches, the solution space can be explored in depth. Experimental results demonstrate that the evolutionary approach can produce better solutions, but the probabilistic approach is close. On the other hand, these approaches provide much better scalability than other existing partial redundancy techniques.This work was supported by the Ministry of Economy and Competitiveness of Spain under project ESP2015-68245-C4-1-P, and by the Czech science foundation project GA16-17538S and the Ministry of Education, Youth and Sports of the Czech Republic from the National Programme of Sustainability (NPU II); project IT4Innovations excellence in science - LQ1602

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Universidad Carlos III de Madrid e-Archivo

Recommended from our members

Fault-tolerant hardware designs and their reliability analysis

Author: Hafezparast Mahmoud
Publication venue: Brunel University School of Engineering and Design PhD Theses
Publication date: 01/01/1990
Field of study

This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University.Fault-tolerance, which is a complement to fault prevention, is an effective method of achieving ultra-high reliability. By taking this approach fault free computation can be achieved despite the presence of fault in the system. In this thesis three new fault tolerant techniques are presented and their advantages over well known fault-tolerant strategies are shown. One of these new techniques achieves higher reliability than any other similar techniques presented in the literature. Generally fault-tolerant structures consist of four major blocks: the replicated modules, the disagreement and detection circuit, the switching circuit, and the voting mechanism. The most critical component in a fault-tolerant system is the voter because the final output of the system is computed by this component. This dissertation presents a new implementation for voters which reduces both the complexity and the occupied area on the chip. The structures of the three techniques developed in this work are such that the complexity of their switching mechanisms grows only linearly with the number of modules but the voting mechanism complexity increases significantly. This is a better approach than those schemes in which the switching complexity increases significantly and the voter's complexity remains constant or grows linearly with the number of modules because it is easier to implement a complex voter than a complex switch (voters have more regular structures). Extensive comparisons are made between different fault-tolerant techniques. A new reliability model is also developed for system reliability evaluation of the new designs. The results of these analyses are plotted, and the advantages of the new techniques are demonstrated. In the final part of the work an expert system is described which uses the knowledge acquired by these comparisons. This expert system is meant as a prototype of a component of a CAD tool which will act as an advisor on fault-tolerant techniques

Brunel University Research Archive

Dynamic assertion testing of flight control software

Author: Andrews D. M.
Mahmood A.
Mccluskey E. J.
Publication venue
Publication date
Field of study

Digital Flight Control System (DFCS) software was used as a test case for assertion testing. The assertions were written and embedded in the code, then errors were inserted (seeded) one at a time and the code executed. Results indicate that assertion testing is an effective and efficient method of detecting errors in flight software. Most errors are eliminate at an earlier stage in the development than before

NASA Technical Reports Server

Dynamic assertion testing of flight control software

Author: Andrews D. M.
Mahmood A.
Mccluskey E. J.
Publication venue
Publication date
Field of study

An experiment in using assertions to dynamically test fault tolerant flight software is described. The experiment showed that 87% of typical errors introduced into the program would be detected by assertions. Detailed analysis of the test data showed that the number of assertions needed to detect those errors could be reduced to a minimal set. The analysis also revealed that the most effective assertions tested program parameters that provided greater indirect (collateral) testing of other parameters

NASA Technical Reports Server

Single event upset hardened embedded domain specific reconfigurable architecture

Author: Baloch Sajid
Publication venue: The University of Edinburgh
Publication date: 01/01/2007
Field of study

Edinburgh Research Archive

Development of a flight software testing methodology

Author: Andrews D. M.
Mccluskey E. J.
Publication venue
Publication date
Field of study

The research to develop a testing methodology for flight software is described. An experiment was conducted in using assertions to dynamically test digital flight control software. The experiment showed that 87% of typical errors introduced into the program would be detected by assertions. Detailed analysis of the test data showed that the number of assertions needed to detect those errors could be reduced to a minimal set. The analysis also revealed that the most effective assertions tested program parameters that provided greater indirect (collateral) testing of other parameters. In addition, a prototype watchdog task system was built to evaluate the effectiveness of executing assertions in parallel by using the multitasking features of Ada

NASA Technical Reports Server

Fault-tolerant building-block computer study

Author: Rennels D. A.
Publication venue
Publication date
Field of study

Ultra-reliable core computers are required for improving the reliability of complex military systems. Such computers can provide reliable fault diagnosis, failure circumvention, and, in some cases serve as an automated repairman for their host systems. A small set of building-block circuits which can be implemented as single very large integration devices, and which can be used with off-the-shelf microprocessors and memories to build self checking computer modules (SCCM) is described. Each SCCM is a microcomputer which is capable of detecting its own faults during normal operation and is described to communicate with other identical modules over one or more Mil Standard 1553A buses. Several SCCMs can be connected into a network with backup spares to provide fault-tolerant operation, i.e. automated recovery from faults. Alternative fault-tolerant SCCM configurations are discussed along with the cost and reliability associated with their implementation

NASA Technical Reports Server