8 research outputs found

Low-cost and efficient fault detection and diagnosis schemes for modern cores

Author: Carretero Casado Javier Sebastian
Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/01/2015

Continuous improvements in transistor scaling together with microarchitectural advances have made possible the widespread adoption of high-performance processors across all market segments. However, the growing reliability threats induced by technology scaling and by the complexity of designs are challenging the production of cheap yet robust systems. Soft error trends are alarming, especially for combinational logic, and parity and ECC codes are therefore becoming insufficient as combinational logic turns into the dominant source of soft errors. Furthermore, experts are warning about the need to also address intermittent and permanent faults during processor runtime, as increasing temperatures and device variations will accelerate inherent aging phenomena. These challenges especially threaten the commodity segments, which impose requirements that existing fault tolerance mechanisms cannot meet. Current techniques based on redundant execution were devised at a time when high penalties were accepted for the sake of high reliability levels. Novel lightweight techniques are therefore needed to enable fault protection in the mass-market segments. The complexity of designs is making post-silicon validation extremely expensive. Validation costs exceed design costs, and the number of discovered bugs is growing, both during validation and once products hit the market. Fault localization and diagnosis are the biggest bottlenecks, magnified by huge detection latencies, limited internal observability, and costly server farms to generate test outputs. This thesis explores two directions to address some of the critical challenges introduced by unreliable technologies and by the limitations of current validation approaches. We first explore mechanisms for comprehensively detecting multiple sources of failures in modern processors during their lifetime (transient, intermittent and permanent faults, as well as design bugs). 
Our solutions embrace a paradigm where fault tolerance is built by exploiting high-level microarchitectural invariants that are reusable across designs, rather than relying on re-execution or ad-hoc block-level protection. To do so, we decompose the basic functionalities of processors into high-level tasks and propose three novel runtime verification solutions that, combined, enable global error detection: a computation/register dataflow checker, a memory dataflow checker, and a control flow checker. The techniques use the concept of end-to-end signatures and allow designers to adjust the fault coverage to their needs by trading off area, power and performance. Our fault injection studies reveal that our methods provide high coverage levels while incurring significantly lower performance, power and area costs than existing techniques. Then, this thesis extends the applicability of the proposed error detection schemes to the validation phases. We present a fault localization and diagnosis solution for the memory dataflow by combining our error detection mechanism, a new low-cost logging mechanism and a diagnosis program. Selected internal activity is continuously traced and kept in a memory-resident log whose capacity can be expanded to suit validation needs. The solution can catch undiscovered bugs, reducing the dependence on simulation farms that compute golden outputs. Upon error detection, the diagnosis algorithm analyzes the log to locate the bug automatically and to determine its root cause. Our evaluations show that very high localization coverage and diagnosis accuracy can be obtained at very low performance and area costs. The net result is a simplification of current debugging practices, which are extremely manual, time-consuming and cumbersome. 
Altogether, the integrated solutions proposed in this thesis enable the industry to deliver more reliable and correct processors as technology evolves into more complex designs and more vulnerable transistors.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Tesis Doctorals en Xarxa

Advanced information processing system: The Army fault tolerant architecture conceptual study. Volume 2: Army fault tolerant architecture design and analysis

Author: Alger L. S.; Babikyan C. A.; Butler B. P.; Friend S. A.; Ganska R. J.; Harper R. E.; Lala J. H.; Masotto T. K.; Meyer A. J.; Morton D. P.

Described here is the Army Fault Tolerant Architecture (AFTA) hardware architecture and components and the operating system. The architectural and operational theory of the AFTA Fault Tolerant Data Bus is discussed. The test and maintenance strategy developed for use in fielded AFTA installations is presented. An approach to be used in reducing the probability of AFTA failure due to common mode faults is described. Analytical models for AFTA performance, reliability, availability, life cycle cost, weight, power, and volume are developed. An approach is presented for using VHSIC Hardware Description Language (VHDL) to describe and design AFTA's developmental hardware. A plan is described for verifying and validating key AFTA concepts during the Dem/Val phase. Analytical models and partial mission requirements are used to generate AFTA configurations for the TF/TA/NOE and Ground Vehicle missions

NASA Technical Reports Server

Designing Effective Logic Obfuscation: Exploring Beyond Gate-Level Boundaries

Author: Zuzak Michael Jeffrey
Publication date: 01/01/2022

The need for high-end performance and cost savings has driven hardware design houses to outsource integrated circuit (IC) fabrication to untrusted manufacturing facilities. During fabrication, the entire chip design is exposed to these potentially malicious facilities, raising concerns of intellectual property (IP) piracy, reverse engineering, and counterfeiting. This is a major concern of both government and private organizations, especially in the context of military hardware. Logic obfuscation techniques have been proposed to prevent these supply-chain attacks. These techniques lock a chip by inserting additional key logic into combinational blocks of a circuit. The resulting design only exhibits correct functionality when a correct key is applied after fabrication. To date, the majority of obfuscation research centers on evaluating combinational constructions with gate-level criteria. However, this approach ignores critical high-level context, such as the interaction between modules and application error resilience. For this dissertation, we move beyond the traditional gate-level view of logic obfuscation, developing criteria and methodologies to design and evaluate obfuscated circuits for hardware-oriented security guarantees that transcend gate-level boundaries. To begin our work, we characterize the security of obfuscation when viewed in the context of a larger IC and consider how to effectively apply logic obfuscation for security beyond gate-level boundaries. We derive a fundamental trade-off underlying all logic obfuscation that is between security and attack resilience. We then develop an open-source, GEM5-based simulator called ObfusGEM, which evaluates logic obfuscation at the architecture/application-level in processor ICs. Using ObfusGEM, we perform an architectural design space exploration of logic obfuscation in processor ICs. 
This exploration indicates that current obfuscation schemes cannot simultaneously achieve security and attack resilience goals. Based on the lessons learned from this design space exploration, we explore 2 orthogonal approaches to design ICs with strong security guarantees beyond gate-level boundaries. For the first approach, we consider how logic obfuscation constructions can be modified to overcome the limitations identified in our design space exploration. This approach results in the development of 3 novel obfuscation techniques targeted towards securing 3 distinct applications. The first technique is Trace Logic Locking which enhances existing obfuscation techniques to provably expand the derived trade-off between security and attack resilience. The second technique is Memory Locking which defines an automatable approach to processor design obfuscation through locking the analog timing effects that govern the function of on-chip SRAM arrays. The third technique is High Error Rate Keys which protect probabilistic circuits against a SAT-based attacker by hiding the correct secret key value under stochastic noise. We demonstrate that all 3 techniques are capable of overcoming the limitations of obfuscation when viewed beyond gate-level boundaries in their respective applications. For the second approach, we consider how architectural design decisions can influence hardware security. We begin by exploring security-aware architecture design, an approach where minor architectural modifications are identified and applied to improve security in processor ICs. We then develop resource binding algorithms for high-level synthesis that optimally bind operations onto obfuscated functional units to amplify security guarantees. In both cases, we show that by designing logic obfuscation using architectural context a designer can secure ICs beyond gate-level boundaries despite the presence of the rigid trade-off that rendered prior obfuscation techniques insecure
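The basic locking idea in the abstract above (key logic inserted into a combinational block, with correct behavior only under the correct key) can be illustrated with a toy model. This is a minimal sketch, not any of the dissertation's techniques: the three-input circuit, the placement of the two key gates, and the names `original_circuit`, `locked_circuit`, and `CORRECT_KEY` are all assumptions chosen for illustration.

```python
from itertools import product

def original_circuit(a: int, b: int, c: int) -> int:
    """Hypothetical unlocked reference function: (a AND b) OR c."""
    return (a & b) | c

def locked_circuit(a: int, b: int, c: int, key: tuple) -> int:
    """The same function with two illustrative key gates inserted."""
    k0, k1 = key
    and_out = (a & b) ^ k0      # XOR key gate: transparent only when k0 == 0
    or_out = and_out | c
    return 1 - (or_out ^ k1)    # XNOR key gate: transparent only when k1 == 1

# Assumed correct key for this toy placement of key gates.
CORRECT_KEY = (0, 1)

def corrupted_inputs(key: tuple) -> int:
    """Count the input patterns on which the locked output is wrong."""
    return sum(
        locked_circuit(a, b, c, key) != original_circuit(a, b, c)
        for a, b, c in product((0, 1), repeat=3)
    )

def matches_original(key: tuple) -> bool:
    """True when the key restores the original function on every input."""
    return corrupted_inputs(key) == 0

# Keys that unlock the circuit, found by exhaustive search over the key space.
unlocking_keys = [k for k in product((0, 1), repeat=2) if matches_original(k)]
```

In this toy case exactly one of the four keys restores the original function, and each wrong key corrupts some input patterns. Exhaustive key search is only feasible here because the key is two bits wide.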

Digital Repository at the University of Maryland

High level behavioural modelling of boundary scan architecture.

Author: Medhat Saad Sabih Ahmed

This project involves the development of a software tool that automatically integrates the IEEE 1149.1/JTAG Boundary Scan Test Architecture into an ASIC (Application Specific Integrated Circuit) design. The tool requires the original design (the ASIC) to be described in the VHDL (IEEE 1076) hardware description language. The tool consists of two major elements: i) a parsing and insertion algorithm developed and implemented in C; ii) a high-level model of the Boundary Scan Test Architecture implemented in VHDL. The parsing and insertion algorithm identifies the design's Input/Output (I/O) terminals, their types and the order in which they appear in the ASIC design. It then attaches suitable Boundary Scan Cells to each I/O, except power and ground, and inserts the high-level models of the full Boundary Scan Architecture into the ASIC without altering the structure of the design core.
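The cell-attachment step described above can be sketched roughly as follows. The original tool is written in C and parses VHDL; this Python sketch skips parsing entirely and starts from an already extracted port list. The `Port` type, the `CELL_FOR_KIND` mapping, and the cell labels are hypothetical and are not taken from the project or from IEEE 1149.1.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Port:
    name: str
    kind: str  # assumed values: "in", "out", "inout", "power", "ground"

# Hypothetical mapping from I/O type to a boundary scan cell label.
CELL_FOR_KIND = {
    "in": "input_cell",
    "out": "output_cell",
    "inout": "bidir_cell",
}

def build_scan_chain(ports):
    """Attach one cell per I/O in declaration order, skipping power and ground."""
    chain = []
    for port in ports:
        if port.kind in ("power", "ground"):
            continue  # supply pins get no boundary scan cell
        chain.append((port.name, CELL_FOR_KIND[port.kind]))
    return chain

# Example port list in the order the terminals appear in the design.
ports = [
    Port("VDD", "power"),
    Port("clk", "in"),
    Port("data", "inout"),
    Port("q", "out"),
    Port("GND", "ground"),
]
chain = build_scan_chain(ports)
```

The chain preserves the declaration order of the ports, matching the abstract's note that the algorithm tracks the order in which I/Os appear.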

Bournemouth University Research Online

Efficient SMT-based ATPG for interconnect open defects

Publication venue: 'EDAA'
Publication date: 01/01/2014

Crossref

Esprit '91. Proceedings of the annual Esprit conference. Brussels, 25-29 November 1991. EUR 13853 EN
