263 research outputs found

    Towards an embedded board-level tester: study of a configurable test processor

    Get PDF
    The demand for electronic systems with more features, higher performance, and less power consumption increases continuously. This is a real challenge for design and test engineers because they have to deal with electronic systems with ever-increasing complexity maintaining production and test costs low and meeting critical time to market deadlines. For a test engineer working at the board-level, this means that manufacturing defects must be detected as soon as possible and at a low cost. However, the use of classical test techniques for testing modern printed circuit boards is not sufficient, and in the worst case these techniques cannot be used at all. This is mainly due to modern packaging technologies, a high device density, and high operation frequencies of modern printed circuit boards. This leads to very long test times, low fault coverage, and high test costs. This dissertation addresses these issues and proposes an FPGA-based test approach for printed circuit boards. The concept is based on a configurable test processor that is temporarily implemented in the on-board FPGA and provides the corresponding mechanisms to communicate to external test equipment and co-processors implemented in the FPGA. This embedded test approach provides the flexibility to implement test functions either in the external test equipment or in the FPGA. In this manner, tests are executed at-speed increasing the fault coverage, test times are reduced, and the test system can be adapted automatically to the properties of the FPGA and devices located on the board. An essential part of the FPGA-based test approach deals with the development of a test processor. In this dissertation the required properties of the processor are discussed, and it is shown that the adaptation to the specific test scenario plays a very important role for the optimization. For this purpose, the test processor is equipped with configuration parameters at the instruction set architecture and microarchitecture level. Additionally, an automatic generation process for the test system and for the computation of some of the processor’s configuration parameters is proposed. The automatic generation process uses as input a model known as the device under test model (DUT-M). In order to evaluate the entire FPGA-based test approach and the viability of a processor for testing printed circuit boards, the developed test system is used to test interconnections to two different devices: a static random memory (SRAM) and a liquid crystal display (LCD). Experiments were conducted in order to determine the resource utilization of the processor and FPGA-based test system and to measure test time when different test functions are implemented in the external test equipment or the FPGA. It has been shown that the introduced approach is suitable to test printed circuit boards and that the test processor represents a realistic alternative for testing at board-level.Der Bedarf an elektronischen Systemen mit zusätzlichen Merkmalen, höherer Leistung und geringerem Energieverbrauch nimmt ständig zu. Dies stellt eine erhebliche Herausforderung für Entwicklungs- und Testingenieure dar, weil sie sich mit elektronischen Systemen mit einer steigenden Komplexität zu befassen haben. Außerdem müssen die Herstellungs- und Testkosten gering bleiben und die Produkteinführungsfristen so kurz wie möglich gehalten werden. Daraus folgt, dass ein Testingenieur, der auf Leiterplatten-Ebene arbeitet, die Herstellungsfehler so früh wie möglich entdecken und dabei möglichst niedrige Kosten verursachen soll. Allerdings sind die klassischen Testmethoden nicht in der Lage, die Anforderungen von modernen Leiterplatten zu erfüllen und im schlimmsten Fall können diese Testmethoden überhaupt nicht verwendet werden. Dies liegt vor allem an modernen Gehäuse-Technologien, der hohen Bauteildichte und den hohen Arbeitsfrequenzen von modernen Leiterplatten. Das führt zu sehr langen Testzeiten, geringer Testabdeckung und hohen Testkosten. Die Dissertation greift diese Problematik auf und liefert einen FPGA-basierten Testansatz für Leiterplatten. Das Konzept beruht auf einem konfigurierbaren Testprozessor, welcher im On-Board-FPGA temporär implementiert wird und die entsprechenden Mechanismen für die Kommunikation mit der externen Testeinrichtung und Co-Prozessoren im FPGA bereitstellt. Dadurch ist es möglich Testfunktionen flexibel entweder auf der externen Testeinrichtung oder auf dem FPGA zu implementieren. Auf diese Weise werden Tests at-speed ausgeführt, um die Testabdeckung zu erhöhen. Außerdem wird die Testzeit verkürzt und das Testsystem automatisch an die Eigenschaften des FPGAs und anderer Bauteile auf der Leiterplatte angepasst. Ein wesentlicher Teil des FPGA-basierten Testansatzes umfasst die Entwicklung eines Testprozessors. In dieser Dissertation wird über die benötigten Eigenschaften des Prozessors diskutiert und es wird gezeigt, dass die Anpassung des Prozessors an den spezifischen Testfall von großer Bedeutung für die Optimierung ist. Zu diesem Zweck wird der Prozessor mit Konfigurationsparametern auf der Befehlssatzarchitektur-Ebene und Mikroarchitektur-Ebene ausgerüstet. Außerdem wird ein automatischer Generierungsprozess für die Realisierung des Testsystems und für die Berechnung einer Untergruppe von Konfigurationsparametern des Prozessors vorgestellt. Der automatische Generierungsprozess benutzt als Eingangsinformation ein Modell des Prüflings (device under test model, DUT-M). Das entwickelte Testsystem wurde zum Testen von Leiterplatten für Verbindungen zwischen dem FPGA und zwei Bauteilen verwendet, um den FPGA-basierten Testansatz und die Durchführbarkeit des Testprozessors für das Testen auf Leiterplatte-Ebene zu evaluieren. Die zwei Bauteile sind ein Speicher mit direktem Zugriff (static random-access memory, SRAM) und eine Flüssigkristallanzeige (liquid crystal display, LCD). Die Experimente wurden durchgeführt, um den Ressourcenverbrauch des Prozessors und Testsystems festzustellen und um die Testzeit zu messen. Dies geschah durch die Implementierung von unterschiedlichen Testfunktionen auf der externen Testeinrichtung und dem FPGA. Dadurch konnte gezeigt werden, dass der FPGA-basierte Ansatz für das Testen von Leiterplatten geeignet ist und dass der Testprozessor eine realistische Alternative für das Testen auf Leiterplatten-Ebene ist

    Error Detection and Diagnosis for System-on-Chip in Space Applications

    Get PDF
    Tesis por compendio de publicacionesLos componentes electrónicos comerciales, comúnmente llamados componentes Commercial-Off-The-Shelf (COTS) están presentes en multitud de dispositivos habituales en nuestro día a día. Particularmente, el uso de microprocesadores y sistemas en chip (SoC) altamente integrados ha favorecido la aparición de dispositivos electrónicos cada vez más inteligentes que sostienen el estilo de vida y el avance de la sociedad moderna. Su uso se ha generalizado incluso en aquellos sistemas que se consideran críticos para la seguridad, como vehículos, aviones, armamento, dispositivos médicos, implantes o centrales eléctricas. En cualquiera de ellos, un fallo podría tener graves consecuencias humanas o económicas. Sin embargo, todos los sistemas electrónicos conviven constantemente con factores internos y externos que pueden provocar fallos en su funcionamiento. La capacidad de un sistema para funcionar correctamente en presencia de fallos se denomina tolerancia a fallos, y es un requisito en el diseño y operación de sistemas críticos. Los vehículos espaciales como satélites o naves espaciales también hacen uso de microprocesadores para operar de forma autónoma o semi autónoma durante su vida útil, con la dificultad añadida de que no pueden ser reparados en órbita, por lo que se consideran sistemas críticos. Además, las duras condiciones existentes en el espacio, y en particular los efectos de la radiación, suponen un gran desafío para el correcto funcionamiento de los dispositivos electrónicos. Concretamente, los fallos transitorios provocados por radiación (conocidos como soft errors) tienen el potencial de ser una de las mayores amenazas para la fiabilidad de un sistema en el espacio. Las misiones espaciales de gran envergadura, típicamente financiadas públicamente como en el caso de la NASA o la Agencia Espacial Europea (ESA), han tenido históricamente como requisito evitar el riesgo a toda costa por encima de cualquier restricción de coste o plazo. Por ello, la selección de componentes resistentes a la radiación (rad-hard) específicamente diseñados para su uso en el espacio ha sido la metodología imperante en el paradigma que hoy podemos denominar industria espacial tradicional, u Old Space. Sin embargo, los componentes rad-hard tienen habitualmente un coste mucho más alto y unas prestaciones mucho menores que otros componentes COTS equivalentes. De hecho, los componentes COTS ya han sido utilizados satisfactoriamente en misiones de la NASA o la ESA cuando las prestaciones requeridas por la misión no podían ser cubiertas por ningún componente rad-hard existente. En los últimos años, el acceso al espacio se está facilitando debido en gran parte a la entrada de empresas privadas en la industria espacial. Estas empresas no siempre buscan evitar el riesgo a toda costa, sino que deben perseguir una rentabilidad económica, por lo que hacen un balance entre riesgo, coste y plazo mediante gestión del riesgo en un paradigma denominado Nuevo Espacio o New Space. Estas empresas a menudo están interesadas en entregar servicios basados en el espacio con las máximas prestaciones y el mayor beneficio posibles, para lo cual los componentes rad-hard son menos atractivos debido a su mayor coste y menores prestaciones que los componentes COTS existentes. Sin embargo, los componentes COTS no han sido específicamente diseñados para su uso en el espacio y típicamente no incluyen técnicas específicas para evitar que los efectos de la radiación afecten su funcionamiento. Los componentes COTS se comercializan tal cual son, y habitualmente no es posible modificarlos para mejorar su resistencia a la radiación. Además, los elevados niveles de integración de los sistemas en chip (SoC) complejos de altas prestaciones dificultan su observación y la aplicación de técnicas de tolerancia a fallos. Este problema es especialmente relevante en el caso de los microprocesadores. Por tanto, existe un gran interés en el desarrollo de técnicas que permitan conocer y mejorar el comportamiento de los microprocesadores COTS bajo radiación sin modificar su arquitectura y sin interferir en su funcionamiento para facilitar su uso en el espacio y con ello maximizar las prestaciones de las misiones espaciales presentes y futuras. En esta Tesis se han desarrollado técnicas novedosas para detectar, diagnosticar y mitigar los errores producidos por radiación en microprocesadores y sistemas en chip (SoC) comerciales, utilizando la interfaz de traza como punto de observación. La interfaz de traza es un recurso habitual en los microprocesadores modernos, principalmente enfocado a soportar las tareas de desarrollo y depuración del software durante la fase de diseño. Sin embargo, una vez el desarrollo ha concluido, la interfaz de traza típicamente no se utiliza durante la fase operativa del sistema, por lo que puede ser reutilizada sin coste. La interfaz de traza constituye un punto de conexión viable para observar el comportamiento de un microprocesador de forma no intrusiva y sin interferir en su funcionamiento. Como resultado de esta Tesis se ha desarrollado un módulo IP capaz de recabar y decodificar la información de traza de un microprocesador COTS moderno de altas prestaciones. El IP es altamente configurable y personalizable para adaptarse a diferentes aplicaciones y tipos de procesadores. Ha sido diseñado y validado utilizando el dispositivo Zynq-7000 de Xilinx como plataforma de desarrollo, que constituye un dispositivo COTS de interés en la industria espacial. Este dispositivo incluye un procesador ARM Cortex-A9 de doble núcleo, que es representativo del conjunto de microprocesadores hard-core modernos de altas prestaciones. El IP resultante es compatible con la tecnología ARM CoreSight, que proporciona acceso a información de traza en los microprocesadores ARM. El IP incorpora técnicas para detectar errores en el flujo de ejecución y en los datos de la aplicación ejecutada utilizando la información de traza, en tiempo real y con muy baja latencia. El IP se ha validado en campañas de inyección de fallos y también en radiación con protones y neutrones en instalaciones especializadas. También se ha combinado con otras técnicas de tolerancia a fallos para construir técnicas híbridas de mitigación de errores. Los resultados experimentales obtenidos demuestran su alta capacidad de detección y potencialidad en el diagnóstico de errores producidos por radiación. El resultado de esta Tesis, desarrollada en el marco de un Doctorado Industrial entre la Universidad Carlos III de Madrid (UC3M) y la empresa Arquimea, se ha transferido satisfactoriamente al entorno empresarial en forma de un proyecto financiado por la Agencia Espacial Europea para continuar su desarrollo y posterior explotación.Commercial electronic components, also known as Commercial-Off-The-Shelf (COTS), are present in a wide variety of devices commonly used in our daily life. Particularly, the use of microprocessors and highly integrated System-on-Chip (SoC) devices has fostered the advent of increasingly intelligent electronic devices which sustain the lifestyles and the progress of modern society. Microprocessors are present even in safety-critical systems, such as vehicles, planes, weapons, medical devices, implants, or power plants. In any of these cases, a fault could involve severe human or economic consequences. However, every electronic system deals continuously with internal and external factors that could provoke faults in its operation. The capacity of a system to operate correctly in presence of faults is known as fault-tolerance, and it becomes a requirement in the design and operation of critical systems. Space vehicles such as satellites or spacecraft also incorporate microprocessors to operate autonomously or semi-autonomously during their service life, with the additional difficulty that they cannot be repaired once in-orbit, so they are considered critical systems. In addition, the harsh conditions in space, and specifically radiation effects, involve a big challenge for the correct operation of electronic devices. In particular, radiation-induced soft errors have the potential to become one of the major risks for the reliability of systems in space. Large space missions, typically publicly funded as in the case of NASA or European Space Agency (ESA), have followed historically the requirement to avoid the risk at any expense, regardless of any cost or schedule restriction. Because of that, the selection of radiation-resistant components (known as rad-hard) specifically designed to be used in space has been the dominant methodology in the paradigm of traditional space industry, also known as “Old Space”. However, rad-hard components have commonly a much higher associated cost and much lower performance that other equivalent COTS devices. In fact, COTS components have already been used successfully by NASA and ESA in missions that requested such high performance that could not be satisfied by any available rad-hard component. In the recent years, the access to space is being facilitated in part due to the irruption of private companies in the space industry. Such companies do not always seek to avoid the risk at any cost, but they must pursue profitability, so they perform a trade-off between risk, cost, and schedule through risk management in a paradigm known as “New Space”. Private companies are often interested in deliver space-based services with the maximum performance and maximum benefit as possible. With such objective, rad-hard components are less attractive than COTS due to their higher cost and lower performance. However, COTS components have not been specifically designed to be used in space and typically they do not include specific techniques to avoid or mitigate the radiation effects in their operation. COTS components are commercialized “as is”, so it is not possible to modify them to improve their susceptibility to radiation effects. Moreover, the high levels of integration of complex, high-performance SoC devices hinder their observability and the application of fault-tolerance techniques. This problem is especially relevant in the case of microprocessors. Thus, there is a growing interest in the development of techniques allowing to understand and improve the behavior of COTS microprocessors under radiation without modifying their architecture and without interfering with their operation. Such techniques may facilitate the use of COTS components in space and maximize the performance of present and future space missions. In this Thesis, novel techniques have been developed to detect, diagnose, and mitigate radiation-induced errors in COTS microprocessors and SoCs using the trace interface as an observation point. The trace interface is a resource commonly found in modern microprocessors, mainly intended to support software development and debugging activities during the design phase. However, it is commonly left unused during the operational phase of the system, so it can be reused with no cost. The trace interface constitutes a feasible connection point to observe microprocessor behavior in a non-intrusive manner and without disturbing processor operation. As a result of this Thesis, an IP module has been developed capable to gather and decode the trace information of a modern, high-end, COTS microprocessor. The IP is highly configurable and customizable to support different applications and processor types. The IP has been designed and validated using the Xilinx Zynq-7000 device as a development platform, which is an interesting COTS device for the space industry. This device features a dual-core ARM Cortex-A9 processor, which is a good representative of modern, high-end, hard-core microprocessors. The resulting IP is compatible with the ARM CoreSight technology, which enables access to trace information in ARM microprocessors. The IP is able to detect errors in the execution flow of the microprocessor and in the application data using trace information, in real time and with very low latency. The IP has been validated in fault injection campaigns and also under proton and neutron irradiation campaigns in specialized facilities. It has also been combined with other fault-tolerance techniques to build hybrid error mitigation approaches. Experimental results demonstrate its high detection capabilities and high potential for the diagnosis of radiation-induced errors. The result of this Thesis, developed in the framework of an Industrial Ph.D. between the University Carlos III of Madrid (UC3M) and the company Arquimea, has been successfully transferred to the company business as a project sponsored by European Space Agency to continue its development and subsequent commercialization.Programa de Doctorado en Ingeniería Eléctrica, Electrónica y Automática por la Universidad Carlos III de MadridPresidenta: María Luisa López Vallejo.- Secretario: Enrique San Millán Heredia.- Vocal: Luigi Di Lill

    Resilience of an embedded architecture using hardware redundancy

    Get PDF
    In the last decade the dominance of the general computing systems market has being replaced by embedded systems with billions of units manufactured every year. Embedded systems appear in contexts where continuous operation is of utmost importance and failure can be profound. Nowadays, radiation poses a serious threat to the reliable operation of safety-critical systems. Fault avoidance techniques, such as radiation hardening, have been commonly used in space applications. However, these components are expensive, lag behind commercial components with regards to performance and do not provide 100% fault elimination. Without fault tolerant mechanisms, many of these faults can become errors at the application or system level, which in turn, can result in catastrophic failures. In this work we study the concepts of fault tolerance and dependability and extend these concepts providing our own definition of resilience. We analyse the physics of radiation-induced faults, the damage mechanisms of particles and the process that leads to computing failures. We provide extensive taxonomies of 1) existing fault tolerant techniques and of 2) the effects of radiation in state-of-the-art electronics, analysing and comparing their characteristics. We propose a detailed model of faults and provide a classification of the different types of faults at various levels. We introduce an algorithm of fault tolerance and define the system states and actions necessary to implement it. We introduce novel hardware and system software techniques that provide a more efficient combination of reliability, performance and power consumption than existing techniques. We propose a new element of the system called syndrome that is the core of a resilient architecture whose software and hardware can adapt to reliable and unreliable environments. We implement a software simulator and disassembler and introduce a testing framework in combination with ERA’s assembler and commercial hardware simulators

    Machine learning support for logic diagnosis

    Get PDF

    NASA Space Engineering Research Center Symposium on VLSI Design

    Get PDF
    The NASA Space Engineering Research Center (SERC) is proud to offer, at its second symposium on VLSI design, presentations by an outstanding set of individuals from national laboratories and the electronics industry. These featured speakers share insights into next generation advances that will serve as a basis for future VLSI design. Questions of reliability in the space environment along with new directions in CAD and design are addressed by the featured speakers

    The Fifth NASA Symposium on VLSI Design

    Get PDF
    The fifth annual NASA Symposium on VLSI Design had 13 sessions including Radiation Effects, Architectures, Mixed Signal, Design Techniques, Fault Testing, Synthesis, Signal Processing, and other Featured Presentations. The symposium provides insights into developments in VLSI and digital systems which can be used to increase data systems performance. The presentations share insights into next generation advances that will serve as a basis for future VLSI design

    Multi-level simulation of nano-electronic digital circuits on GPUs

    Get PDF
    Simulation of circuits and faults is an essential part in design and test validation tasks of contemporary nano-electronic digital integrated CMOS circuits. Shrinking technology processes with smaller feature sizes and strict performance and reliability requirements demand not only detailed validation of the functional properties of a design, but also accurate validation of non-functional aspects including the timing behavior. However, due to the rising complexity of the circuit behavior and the steady growth of the designs with respect to the transistor count, timing-accurate simulation of current designs requires a lot of computational effort which can only be handled by proper abstraction and a high degree of parallelization. This work presents a simulation model for scalable and accurate timing simulation of digital circuits on data-parallel graphics processing unit (GPU) accelerators. By providing compact modeling and data-structures as well as through exploiting multiple dimensions of parallelism, the simulation model enables not only fast and timing-accurate simulation at logic level, but also massively-parallel simulation with switch level accuracy. The model facilitates extensions for fast and efficient fault simulation of small delay faults at logic level, as well as first-order parametric and parasitic faults at switch level. With the parallelization on GPUs, detailed and scalable simulation is enabled that is applicable even to multi-million gate designs. This way, comprehensive analyses of realistic timing-related faults in presence of process- and parameter variations are enabled for the first time. Additional simulation efficiency is achieved by merging the presented methods in a unified simulation model, that allows to combine the unique advantages of the different levels of abstraction in a mixed-abstraction multi-level simulation flow to reach even higher speedups. Experimental results show that the implemented parallel approach achieves unprecedented simulation throughput as well as high speedup compared to conventional timing simulators. The underlying model scales for multi-million gate designs and gives detailed insights into the timing behavior of digital CMOS circuits, thereby enabling large-scale applications to aid even highly complex design and test validation tasks

    Embedded System Design

    Get PDF
    A unique feature of this open access textbook is to provide a comprehensive introduction to the fundamental knowledge in embedded systems, with applications in cyber-physical systems and the Internet of things. It starts with an introduction to the field and a survey of specification models and languages for embedded and cyber-physical systems. It provides a brief overview of hardware devices used for such systems and presents the essentials of system software for embedded systems, including real-time operating systems. The author also discusses evaluation and validation techniques for embedded systems and provides an overview of techniques for mapping applications to execution platforms, including multi-core platforms. Embedded systems have to operate under tight constraints and, hence, the book also contains a selected set of optimization techniques, including software optimization techniques. The book closes with a brief survey on testing. This fourth edition has been updated and revised to reflect new trends and technologies, such as the importance of cyber-physical systems (CPS) and the Internet of things (IoT), the evolution of single-core processors to multi-core processors, and the increased importance of energy efficiency and thermal issues

    Embedded System Design

    Get PDF
    A unique feature of this open access textbook is to provide a comprehensive introduction to the fundamental knowledge in embedded systems, with applications in cyber-physical systems and the Internet of things. It starts with an introduction to the field and a survey of specification models and languages for embedded and cyber-physical systems. It provides a brief overview of hardware devices used for such systems and presents the essentials of system software for embedded systems, including real-time operating systems. The author also discusses evaluation and validation techniques for embedded systems and provides an overview of techniques for mapping applications to execution platforms, including multi-core platforms. Embedded systems have to operate under tight constraints and, hence, the book also contains a selected set of optimization techniques, including software optimization techniques. The book closes with a brief survey on testing. This fourth edition has been updated and revised to reflect new trends and technologies, such as the importance of cyber-physical systems (CPS) and the Internet of things (IoT), the evolution of single-core processors to multi-core processors, and the increased importance of energy efficiency and thermal issues
    corecore