974 research outputs found
Using Fine Grain Approaches for highly reliable Design of FPGA-based Systems in Space
Nowadays using SRAM based FPGAs in space missions is increasingly considered due to their flexibility and reprogrammability. A challenge is the devices sensitivity to radiation effects that increased with modern architectures due to smaller CMOS structures. This work proposes fault tolerance methodologies, that are based on a fine grain view to modern reconfigurable architectures. The focus is on SEU mitigation challenges in SRAM based FPGAs which can result in crucial situations
Techniques for the realization of ultra- reliable spaceborne computer Final report
Bibliography and new techniques for use of error correction and redundancy to improve reliability of spaceborne computer
Synthesis and Optimization of Reversible Circuits - A Survey
Reversible logic circuits have been historically motivated by theoretical
research in low-power electronics as well as practical improvement of
bit-manipulation transforms in cryptography and computer graphics. Recently,
reversible circuits have attracted interest as components of quantum
algorithms, as well as in photonic and nano-computing technologies where some
switching devices offer no signal gain. Research in generating reversible logic
distinguishes between circuit synthesis, post-synthesis optimization, and
technology mapping. In this survey, we review algorithmic paradigms ---
search-based, cycle-based, transformation-based, and BDD-based --- as well as
specific algorithms for reversible synthesis, both exact and heuristic. We
conclude the survey by outlining key open challenges in synthesis of reversible
and quantum logic, as well as most common misconceptions.Comment: 34 pages, 15 figures, 2 table
Control Caching : a fault-tolerant architecture for SEU mitigation in microprocessor control logic
The importance of fault tolerance at the processor architecture level has been made increasingly important due to rapid advancements in the design and usage of high performance devices and embedded processors. System level solutions to the challenge of fault tolerance flag errors and utilize penalty cycles to recover through the re-execution of instructions. This motivates the need for a hybrid technique providing fault detection as well as fault masking, with minimal penalty cycles for recovery from detected errors. In this research, we propose Control Caching, an architectural technique comprising of three schemes to protect the control logic of microprocessors against Single Event Upsets (SEUs). High fault coverage with relatively low hardware overhead is obtained by using both fault detection with recovery and fault masking. Control signals are classified as either static or dynamic, and static signals are further classified as opcode dependent and instruction dependent. The strategy for protecting static instruction dependent control signals utilizes a distributed cache of the history of the control bits along with the Triple Modular Redundancy (TMR) concept, while the opcode dependent control signals are protected by a distributed cache which can be used to flag errors. Dynamic signals are protected by selective duplication of datapath components. The techniques are implemented on the OpenRISC 1200 processor. Our simulation results show that fault detection with single cycle recovery is provided for 92% of all instruction executions. FPGA synthesis is performed to analyze the associated cycle time and area overheads
Combined Time and Information Redundancy for SEU-Tolerance in Energy-Efficient Real-Time Systems
Recently the trade-off between energy consumption and fault-tolerance in real-time systems has been highlighted. These works have focused on dynamic voltage scaling (DVS) to reduce dynamic energy dissipation and on time redundancy to achieve transient-fault tolerance. While the time redundancy technique exploits the available slack time to increase the fault-tolerance by performing recovery executions, DVS exploits slack time to save energy. Therefore we believe there is a resource conflict between the time-redundancy technique and DVS. The first aim of this paper is to propose the usage of information redundancy to solve this problem. We demonstrate through analytical and experimental studies that it is possible to achieve both higher transient fault-tolerance (tolerance to single event upsets (SEU)) and less energy using a combination of information and time redundancy when compared with using time redundancy alone. The second aim of this paper is to analyze the interplay of transient-fault tolerance (SEU-tolerance) and adaptive body biasing (ABB) used to reduce static leakage energy, which has not been addressed in previous studies. We show that the same technique (i.e. the combination of time and information redundancy) is applicable to ABB-enabled systems and provides more advantages than time redundancy alone
New Fault Detection, Mitigation and Injection Strategies for Current and Forthcoming Challenges of HW Embedded Designs
Tesis por compendio[EN] Relevance of electronics towards safety of common devices has only been growing, as an ever growing stake of the functionality is assigned to them. But of course, this comes along the constant need for higher performances to fulfill such functionality requirements, while keeping power and budget low. In this scenario, industry is struggling to provide a technology which meets all the performance, power and price specifications, at the cost of an increased vulnerability to several types of known faults or the appearance of new ones.
To provide a solution for the new and growing faults in the systems, designers have been using traditional techniques from safety-critical applications, which offer in general suboptimal results. In fact, modern embedded architectures offer the possibility of optimizing the dependability properties by enabling the interaction of hardware, firmware and software levels in the process. However, that point is not yet successfully achieved. Advances in every level towards that direction are much needed if flexible, robust, resilient and cost effective fault tolerance is desired. The work presented here focuses on the hardware level, with the background consideration of a potential integration into a holistic approach.
The efforts in this thesis have focused several issues: (i) to introduce additional fault models as required for adequate representativity of physical effects blooming in modern manufacturing technologies, (ii) to provide tools and methods to efficiently inject both the proposed models and classical ones, (iii) to analyze the optimum method for assessing the robustness of the systems by using extensive fault injection and later correlation with higher level layers in an effort to cut development time and cost, (iv) to provide new detection methodologies to cope with challenges modeled by proposed fault models, (v) to propose mitigation strategies focused towards tackling such new threat scenarios and (vi) to devise an automated methodology for the deployment of many fault tolerance mechanisms in a systematic robust way.
The outcomes of the thesis constitute a suite of tools and methods to help the designer of critical systems in his task to develop robust, validated, and on-time designs tailored to his application.[ES] La relevancia que la electrónica adquiere en la seguridad de los productos ha crecido inexorablemente, puesto que cada vez ésta copa una mayor influencia en la funcionalidad de los mismos. Pero, por supuesto, este hecho viene acompañado de una necesidad constante de mayores prestaciones para cumplir con los requerimientos funcionales, al tiempo que se mantienen los costes y el consumo en unos niveles reducidos. En este escenario, la industria está realizando esfuerzos para proveer una tecnología que cumpla con todas las especificaciones de potencia, consumo y precio, a costa de un incremento en la vulnerabilidad a múltiples tipos de fallos conocidos o la introducción de nuevos.
Para ofrecer una solución a los fallos nuevos y crecientes en los sistemas, los diseñadores han recurrido a técnicas tradicionalmente asociadas a sistemas críticos para la seguridad, que ofrecen en general resultados sub-óptimos. De hecho, las arquitecturas empotradas modernas ofrecen la posibilidad de optimizar las propiedades de confiabilidad al habilitar la interacción de los niveles de hardware, firmware y software en el proceso. No obstante, ese punto no está resulto todavía. Se necesitan avances en todos los niveles en la mencionada dirección para poder alcanzar los objetivos de una tolerancia a fallos flexible, robusta, resiliente y a bajo coste. El trabajo presentado aquí se centra en el nivel de hardware, con la consideración de fondo de una potencial integración en una estrategia holística.
Los esfuerzos de esta tesis se han centrado en los siguientes aspectos: (i) la introducción de modelos de fallo adicionales requeridos para la representación adecuada de efectos físicos surgentes en las tecnologías de manufactura actuales, (ii) la provisión de herramientas y métodos para la inyección eficiente de los modelos propuestos y de los clásicos, (iii) el análisis del método óptimo para estudiar la robustez de sistemas mediante el uso de inyección de fallos extensiva, y la posterior correlación con capas de más alto nivel en un esfuerzo por recortar el tiempo y coste de desarrollo, (iv) la provisión de nuevos métodos de detección para cubrir los retos planteados por los modelos de fallo propuestos, (v) la propuesta de estrategias de mitigación enfocadas hacia el tratamiento de dichos escenarios de amenaza y (vi) la introducción de una metodología automatizada de despliegue de diversos mecanismos de tolerancia a fallos de forma robusta y sistemática.
Los resultados de la presente tesis constituyen un conjunto de herramientas y métodos para ayudar al diseñador de sistemas críticos en su tarea de desarrollo de diseños robustos, validados y en tiempo adaptados a su aplicación.[CA] La rellevància que l'electrònica adquireix en la seguretat dels productes ha crescut inexorablement, puix cada volta més aquesta abasta una major influència en la funcionalitat dels mateixos. Però, per descomptat, aquest fet ve acompanyat d'un constant necessitat de majors prestacions per acomplir els requeriments funcionals, mentre es mantenen els costos i consums en uns nivells reduïts. Donat aquest escenari, la indústria està fent esforços per proveir una tecnologia que complisca amb totes les especificacions de potència, consum i preu, tot a costa d'un increment en la vulnerabilitat a diversos tipus de fallades conegudes, i a la introducció de nous tipus.
Per oferir una solució a les noves i creixents fallades als sistemes, els dissenyadors han recorregut a tècniques tradicionalment associades a sistemes crítics per a la seguretat, que en general oferixen resultats sub-òptims. De fet, les arquitectures empotrades modernes oferixen la possibilitat d'optimitzar les propietats de confiabilitat en habilitar la interacció dels nivells de hardware, firmware i software en el procés. Tot i això eixe punt no està resolt encara. Es necessiten avanços a tots els nivells en l'esmentada direcció per poder assolir els objectius d'una tolerància a fallades flexible, robusta, resilient i a baix cost. El treball ací presentat se centra en el nivell de hardware, amb la consideració de fons d'una potencial integració en una estratègia holística.
Els esforços d'esta tesi s'han centrat en els següents aspectes: (i) la introducció de models de fallada addicionals requerits per a la representació adequada d'efectes físics que apareixen en les tecnologies de fabricació actuals, (ii) la provisió de ferramentes i mètodes per a la injecció eficient del models proposats i dels clàssics, (iii) l'anàlisi del mètode òptim per estudiar la robustesa de sistemes mitjançant l'ús d'injecció de fallades extensiva, i la posterior correlació amb capes de més alt nivell en un esforç per retallar el temps i cost de desenvolupament, (iv) la provisió de nous mètodes de detecció per cobrir els reptes plantejats pels models de fallades proposats, (v) la proposta d'estratègies de mitigació enfocades cap al tractament dels esmentats escenaris d'amenaça i (vi) la introducció d'una metodologia automatitzada de desplegament de diversos mecanismes de tolerància a fallades de forma robusta i sistemàtica.
Els resultats de la present tesi constitueixen un conjunt de ferramentes i mètodes per ajudar el dissenyador de sistemes crítics en la seua tasca de desenvolupament de dissenys robustos, validats i a temps adaptats a la seua aplicació.Espinosa García, J. (2016). New Fault Detection, Mitigation and Injection Strategies for Current and Forthcoming Challenges of HW Embedded Designs [Tesis doctoral no publicada]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/73146TESISCompendi
Recommended from our members
IC design for reliability
textAs the feature size of integrated circuits goes down to the nanometer scale,
transient and permanent reliability issues are becoming a significant concern for circuit
designers. Traditionally, the reliability issues were mostly handled at the device level as a
device engineering problem. However, the increasing severity of reliability challenges
and higher error rates due to transient upsets favor higher-level design for reliability
(DFR). In this work, we develop several methods for DFR at the circuit level.
A major source of transient errors is the single event upset (SEU). SEUs are
caused by high-energy particles present in the cosmic rays or emitted by radioactive
contaminants in the chip packaging materials. When these particles hit a N+/P+ depletion
region of an MOS transistor, they may generate a temporary logic fault. Depending on
where the MOS transistor is located and what state the circuit is at, an SEU may result in
a circuit-level error. We analyze SEUs both in combinational logic and memories
(SRAM). For combinational logic circuit, we propose FASER, a Fast Analysis tool of
Soft ERror susceptibility for cell-based designs. The efficiency of FASER is achieved
through its static and vector-less nature. In order to evaluate the impact of SEU on SRAM, a theory for estimating dynamic noise margins is developed analytically. The
results allow predicting the transient error susceptibility of an SRAM cell using a closedform
expression.
Among the many permanent failure mechanisms that include time-dependent
oxide breakdown (TDDB), electro-migration (EM), hot carrier effect (HCE), and
negative bias temperature instability (NBTI), NBTI has recently become important.
Therefore, the main focus of our work is NBTI. NBTI occurs when the gate of PMOS is
negatively biased. The voltage stress across the gate generates interface traps, which
degrade the threshold voltage of PMOS. The degraded PMOS may eventually fail to meet
timing requirement and cause functional errors. NBTI becomes severe at elevated
temperatures. In this dissertation, we propose a NBTI degradation model that takes into
account the temperature variation on the chip and gives the accurate estimation of the
degraded threshold voltage.
In order to account for the degradation of devices, traditional design methods add
guard-bands to ensure that the circuit will function properly during its lifetime. However,
the worst-case based guard-bands lead to significant penalty in performance. In this
dissertation, we propose an effective macromodel-based reliability tracking and
management framework, based on a hybrid network of on-chip sensors, consisting of
temperature sensors and ring oscillators. The model is concerned specifically with NBTIinduced
transistor aging. The key feature of our work, in contrast to the traditional
tracking techniques that rely solely on direct measurement of the increase of threshold
voltage or circuit delay, is an explicit macromodel which maps operating temperature to
circuit degradation (the increase of circuit delay). The macromodel allows for costeffective
tracking of reliability using temperature sensors and is also essential for
enabling the control loop of the reliability management system. The developed methods improve the over-conservatism of the device-level, worstcase
reliability estimation techniques. As the severity of reliability challenges continue to
grow with technology scaling, it will become more important for circuit designers/CAD
tools to be equipped with the developed methods.Electrical and Computer Engineerin
- …