367 research outputs found

    Fault-based Analysis of Industrial Cyber-Physical Systems

    Get PDF
    The fourth industrial revolution called Industry 4.0 tries to bridge the gap between traditional Electronic Design Automation (EDA) technologies and the necessity of innovating in many indus- trial fields, e.g., automotive, avionic, and manufacturing. This complex digitalization process in- volves every industrial facility and comprises the transformation of methodologies, techniques, and tools to improve the efficiency of every industrial process. The enhancement of functional safety in Industry 4.0 applications needs to exploit the studies related to model-based and data-driven anal- yses of the deployed Industrial Cyber-Physical System (ICPS). Modeling an ICPS is possible at different abstraction levels, relying on the physical details included in the model and necessary to describe specific system behaviors. However, it is extremely complicated because an ICPS is com- posed of heterogeneous components related to different physical domains, e.g., digital, electrical, and mechanical. In addition, it is also necessary to consider not only nominal behaviors but even faulty behaviors to perform more specific analyses, e.g., predictive maintenance of specific assets. Nevertheless, these faulty data are usually not present or not available directly from the industrial machinery. To overcome these limitations, constructing a virtual model of an ICPS extended with different classes of faults enables the characterization of faulty behaviors of the system influenced by different faults. In literature, these topics are addressed with non-uniformly approaches and with the absence of standardized and automatic methodologies for describing and simulating faults in the different domains composing an ICPS. This thesis attempts to overcome these state-of-the-art gaps by proposing novel methodologies, techniques, and tools to: model and simulate analog and multi-domain systems; abstract low-level models to higher-level behavioral models; and monitor industrial systems based on the Industrial Internet of Things (IIOT) paradigm. Specifically, the proposed contributions involve the exten- sion of state-of-the-art fault injection practices to improve the ICPSs safety, the development of frameworks for safety operations automatization, and the definition of a monitoring framework for ICPSs. Overall, fault injection in analog and digital models is the state of the practice to en- sure functional safety, as mentioned in the ISO 26262 standard specific for the automotive field. Starting from state-of-the-art defects defined for analog descriptions, new defects are proposed to enhance the IEEE P2427 draft standard for analog defect modeling and coverage. Moreover, dif- ferent techniques to abstract a transistor-level model to a behavioral model are proposed to speed up the simulation of faulty circuits. Therefore, unlike the electrical domain, there is no extensive use of fault injection techniques in the mechanical one. Thus, extending the fault injection to the mechanical and thermal fields allows for supporting the definition and evaluation of more reliable safety mechanisms. Hence, a taxonomy of mechanical faults is derived from the electrical domain by exploiting the physical analogies. Furthermore, specific tools are built for automatically instru- menting different descriptions with multi-domain faults. The entire work is proposed as a basis for supporting the creation of increasingly resilient and secure ICPS that need to preserve functional safety in any operating context

    Moving Towards Analog Functional Safety

    Get PDF
    Over the past century, the exponential growth of the semiconductor industry has led to the creation of tiny and complex integrated circuits, e.g., sensors, actuators, and smart power systems. Innovative techniques are needed to ensure the correct functionality of analog devices that are ubiquitous in every smart system. The standard ISO 26262 related to functional safety in the automotive context specifies that fault injection is necessary to validate all electronic devices. For decades, standardizing fault modeling, injection and simulation mainly focused on digital circuits and disregarding analog ones. An initial attempt is being made with the IEEE P2427 standard draft standard that started to give this field a structured and formal organization. In this context, new fault models, injection, and abstraction methodologies for analog circuits are proposed in this thesis to enhance this application field. The faults proposed by the IEEE P2427 standard draft standard are initially evaluated to understand the associated fault behaviors during the simulation. Moreover, a novel approach is presented for modeling realistic stuck-on/off defects based on oxide defects. These new defects proposed are required because digital stuck-at-fault models where a transistor is frozen in on-state or offstate may not apply well on analog circuits because even a slight variation could create deviations of several magnitudes. Then, for validating the proposed defects models, a novel predictive fault grouping based on faulty AC matrices is applied to group faults with equivalent behaviors. The proposed fault grouping method is computationally cheap because it avoids performing DC or transient simulations with faults injected and limits itself to faulty AC simulations. Using AC simulations results in two different methods that allow grouping faults with the same frequency response are presented. The first method is an AC-based grouping method that exploits the potentialities of the S-parameters ports. While the second is a Circle-based grouping based on the circle-fitting method applied to the extracted AC matrices. Finally, an open-source framework is presented for the fault injection and manipulation perspective. This framework relies on the shared semantics for reading, writing, or manipulating transistor-level designs. The ultimate goal of the framework is: reading an input design written in a specific syntax and then allowing to write the same design in another syntax. As a use case for the proposed framework, a process of analog fault injection is discussed. This activity requires adding, removing, or replacing nodes, components, or even entire sub-circuits. The framework is entirely written in C++, and its APIs are also interfaced with Python. The entire framework is open-source and available on GitHub. The last part of the thesis presents abstraction methodologies that can abstract transistor level models into Verilog-AMS models and Verilog- AMS piecewise and nonlinear models into C++. These abstracted models can be integrated into heterogeneous systems. The purpose of integration is the simulation of heterogeneous components embedded in a Virtual Platforms (VP) needs to be fast and accurate

    A Survey of Fault-Injection Methodologies for Soft Error Rate Modeling in Systems-on-Chips

    Get PDF
    The development of process technology has increased system performance, but the system failure probability has also significantly increased. It is important to consider the system reliability in addition to the cost, performance, and power consumption. In this paper, we describe the types of faults that occur in a system and where these faults originate. Then, fault-injection techniques, which are used to characterize the fault rate of a system-on-chip (SoC), are investigated to provide a guideline to SoC designers for the realization of resilient SoCs

    Standart-konformes Snapshotting für SystemC Virtuelle Plattformen

    Get PDF
    The steady increase in complexity of high-end embedded systems goes along with an increasingly complex design process. We are currently still in a transition phase from Hardware-Description Language (HDL) based design towards virtual-platform-based design of embedded systems. As design complexity rises faster than developer productivity a gap forms. Restoring productivity while at the same time managing increased design complexity can also be achieved through focussing on the development of new tools and design methodologies. In most application areas, high-level modelling languages such as SystemC are used in early design phases. In modern software development Continuous Integration (CI) is used to automatically test if a submitted piece of code breaks functionality. Application of the CI concept to embedded system design and testing requires fast build and test execution times from the virtual platform framework. For this use case the ability to save a specific state of a virtual platform becomes necessary. The saving and restoring of specific states of a simulation requires the ability to serialize all data structures within the simulation models. Improving the frameworks and establishing better methods will only help to narrow the design gap, if these changes are introduced with the needs of the engineers and developers in mind. Ultimately, it is their productivity that shall be improved. The ability to save the state of a virtual platform enables developers to run longer test campaigns that can even contain randomized test stimuli. If the saved states are modifiable the developers can inject faulty states into the simulation models. This work contributes an extension to the SoCRocket virtual platform framework to enable snapshotting. The snapshotting extension can be considered a reference implementation as the utilization of current SystemC/TLM standards makes it compatible to other frameworkds. Furthermore, integrating the UVM SystemC library into the framework enables test driven development and fast validation of SystemC/TLM models using snapshots. These extensions narrow the design gap by supporting designers, testers and developers to work more efficiently.Die stetige Steigerung der Komplexität eingebetteter Systeme geht einher mit einer ebenso steigenden Komplexität des Entwurfsprozesses. Wir befinden uns momentan in der Übergangsphase vom Entwurf von eingebetteten Systemen basierend auf Hardware-Beschreibungssprachen hin zum Entwurf ebendieser basierend auf virtuellen Plattformen. Da die Entwurfskomplexität rasanter steigt als die Produktivität der Entwickler, entsteht eine Kluft. Die Produktivität wiederherzustellen und gleichzeitig die gesteigerte Entwurfskomplexität zu bewältigen, kann auch erreicht werden, indem der Fokus auf die Entwicklung neuer Werkzeuge und Entwurfsmethoden gelegt wird. In den meisten Anwendungsgebieten werden Modellierungssprachen auf hoher Ebene, wie zum Beispiel SystemC, in den frühen Entwurfsphasen benutzt. In der modernen Software-Entwicklung wird Continuous Integration (CI) benutzt um automatisiert zu überprüfen, ob eine eingespielte Änderung am Quelltext bestehende Funktionalitäten beeinträchtigt. Die Anwendung des CI-Konzepts auf den Entwurf und das Testen von eingebetteten Systemen fordert schnelle Bau- und Test-Ausführungszeiten von dem genutzten Framework für virtuelle Plattformen. Für diesen Anwendungsfall wird auch die Fähigkeit, einen bestimmten Zustand der virtuellen Plattform zu speichern, erforderlich. Das Speichern und Wiederherstellen der Zustände einer Simulation erfordert die Serialisierung aller Datenstrukturen, die sich in den Simulationsmodellen befinden. Das Verbessern von Frameworks und Etablieren besserer Methodiken hilft nur die Entwurfs-Kluft zu verringern, wenn diese Änderungen mit Berücksichtigung der Bedürfnisse der Entwickler und Ingenieure eingeführt werden. Letztendlich ist es ihre Produktivität, die gesteigert werden soll. Die Fähigkeit den Zustand einer virtuellen Plattform zu speichern, ermöglicht es den Entwicklern, längere Testkampagnen laufen zu lassen, die auch zufällig erzeugte Teststimuli beinhalten können oder, falls die gespeicherten Zustände modifizierbar sind, fehlerbehaftete Zustände in die Simulationsmodelle zu injizieren. Mein mit dieser Arbeit geleisteter Beitrag beinhaltet die Erweiterung des SoCRocket Frameworks um Checkpointing Funktionalität im Sinne einer Referenzimplementierung. Weiterhin ermöglicht die Integration der UVM SystemC Bibliothek in das Framework die Umsetzung der testgetriebenen Entwicklung und schnelle Validierung von SystemC/TLM Modellen mit Hilfe von Snapshots

    Dependability-driven Strategies to Improve the Design and Verification of Safety-Critical HDL-based Embedded Systems

    Full text link
    [ES] La utilización de sistemas empotrados en cada vez más ámbitos de aplicación está llevando a que su diseño deba enfrentarse a mayores requisitos de rendimiento, consumo de energía y área (PPA). Asimismo, su utilización en aplicaciones críticas provoca que deban cumplir con estrictos requisitos de confiabilidad para garantizar su correcto funcionamiento durante períodos prolongados de tiempo. En particular, el uso de dispositivos lógicos programables de tipo FPGA es un gran desafío desde la perspectiva de la confiabilidad, ya que estos dispositivos son muy sensibles a la radiación. Por todo ello, la confiabilidad debe considerarse como uno de los criterios principales para la toma de decisiones a lo largo del todo flujo de diseño, que debe complementarse con diversos procesos que permitan alcanzar estrictos requisitos de confiabilidad. Primero, la evaluación de la robustez del diseño permite identificar sus puntos débiles, guiando así la definición de mecanismos de tolerancia a fallos. Segundo, la eficacia de los mecanismos definidos debe validarse experimentalmente. Tercero, la evaluación comparativa de la confiabilidad permite a los diseñadores seleccionar los componentes prediseñados (IP), las tecnologías de implementación y las herramientas de diseño (EDA) más adecuadas desde la perspectiva de la confiabilidad. Por último, la exploración del espacio de diseño (DSE) permite configurar de manera óptima los componentes y las herramientas seleccionados, mejorando así la confiabilidad y las métricas PPA de la implementación resultante. Todos los procesos anteriormente mencionados se basan en técnicas de inyección de fallos para evaluar la robustez del sistema diseñado. A pesar de que existe una amplia variedad de técnicas de inyección de fallos, varias problemas aún deben abordarse para cubrir las necesidades planteadas en el flujo de diseño. Aquellas soluciones basadas en simulación (SBFI) deben adaptarse a los modelos de nivel de implementación, teniendo en cuenta la arquitectura de los diversos componentes de la tecnología utilizada. Las técnicas de inyección de fallos basadas en FPGAs (FFI) deben abordar problemas relacionados con la granularidad del análisis para poder localizar los puntos débiles del diseño. Otro desafío es la reducción del coste temporal de los experimentos de inyección de fallos. Debido a la alta complejidad de los diseños actuales, el tiempo experimental dedicado a la evaluación de la confiabilidad puede ser excesivo incluso en aquellos escenarios más simples, mientras que puede ser inviable en aquellos procesos relacionados con la evaluación de múltiples configuraciones alternativas del diseño. Por último, estos procesos orientados a la confiabilidad carecen de un soporte instrumental que permita cubrir el flujo de diseño con toda su variedad de lenguajes de descripción de hardware, tecnologías de implementación y herramientas de diseño. Esta tesis aborda los retos anteriormente mencionados con el fin de integrar, de manera eficaz, estos procesos orientados a la confiabilidad en el flujo de diseño. Primeramente, se proponen nuevos métodos de inyección de fallos que permiten una evaluación de la confiabilidad, precisa y detallada, en diferentes niveles del flujo de diseño. Segundo, se definen nuevas técnicas para la aceleración de los experimentos de inyección que mejoran su coste temporal. Tercero, se define dos estrategias DSE que permiten configurar de manera óptima (desde la perspectiva de la confiabilidad) los componentes IP y las herramientas EDA, con un coste experimental mínimo. Cuarto, se propone un kit de herramientas que automatiza e incorpora con eficacia los procesos orientados a la confiabilidad en el flujo de diseño semicustom. Finalmente, se demuestra la utilidad y eficacia de las propuestas mediante un caso de estudio en el que se implementan tres procesadores empotrados en un FPGA de Xilinx serie 7.[CA] La utilització de sistemes encastats en cada vegada més àmbits d'aplicació està portant al fet que el seu disseny haja d'enfrontar-se a majors requisits de rendiment, consum d'energia i àrea (PPA). Així mateix, la seua utilització en aplicacions crítiques provoca que hagen de complir amb estrictes requisits de confiabilitat per a garantir el seu correcte funcionament durant períodes prolongats de temps. En particular, l'ús de dispositius lògics programables de tipus FPGA és un gran desafiament des de la perspectiva de la confiabilitat, ja que aquests dispositius són molt sensibles a la radiació. Per tot això, la confiabilitat ha de considerar-se com un dels criteris principals per a la presa de decisions al llarg del tot flux de disseny, que ha de complementar-se amb diversos processos que permeten aconseguir estrictes requisits de confiabilitat. Primer, l'avaluació de la robustesa del disseny permet identificar els seus punts febles, guiant així la definició de mecanismes de tolerància a fallades. Segon, l'eficàcia dels mecanismes definits ha de validar-se experimentalment. Tercer, l'avaluació comparativa de la confiabilitat permet als dissenyadors seleccionar els components predissenyats (IP), les tecnologies d'implementació i les eines de disseny (EDA) més adequades des de la perspectiva de la confiabilitat. Finalment, l'exploració de l'espai de disseny (DSE) permet configurar de manera òptima els components i les eines seleccionats, millorant així la confiabilitat i les mètriques PPA de la implementació resultant. Tots els processos anteriorment esmentats es basen en tècniques d'injecció de fallades per a poder avaluar la robustesa del sistema dissenyat. A pesar que existeix una àmplia varietat de tècniques d'injecció de fallades, diverses problemes encara han d'abordar-se per a cobrir les necessitats plantejades en el flux de disseny. Aquelles solucions basades en simulació (SBFI) han d'adaptar-se als models de nivell d'implementació, tenint en compte l'arquitectura dels diversos components de la tecnologia utilitzada. Les tècniques d'injecció de fallades basades en FPGAs (FFI) han d'abordar problemes relacionats amb la granularitat de l'anàlisi per a poder localitzar els punts febles del disseny. Un altre desafiament és la reducció del cost temporal dels experiments d'injecció de fallades. A causa de l'alta complexitat dels dissenys actuals, el temps experimental dedicat a l'avaluació de la confiabilitat pot ser excessiu fins i tot en aquells escenaris més simples, mentre que pot ser inviable en aquells processos relacionats amb l'avaluació de múltiples configuracions alternatives del disseny. Finalment, aquests processos orientats a la confiabilitat manquen d'un suport instrumental que permeta cobrir el flux de disseny amb tota la seua varietat de llenguatges de descripció de maquinari, tecnologies d'implementació i eines de disseny. Aquesta tesi aborda els reptes anteriorment esmentats amb la finalitat d'integrar, de manera eficaç, aquests processos orientats a la confiabilitat en el flux de disseny. Primerament, es proposen nous mètodes d'injecció de fallades que permeten una avaluació de la confiabilitat, precisa i detallada, en diferents nivells del flux de disseny. Segon, es defineixen noves tècniques per a l'acceleració dels experiments d'injecció que milloren el seu cost temporal. Tercer, es defineix dues estratègies DSE que permeten configurar de manera òptima (des de la perspectiva de la confiabilitat) els components IP i les eines EDA, amb un cost experimental mínim. Quart, es proposa un kit d'eines (DAVOS) que automatitza i incorpora amb eficàcia els processos orientats a la confiabilitat en el flux de disseny semicustom. Finalment, es demostra la utilitat i eficàcia de les propostes mitjançant un cas d'estudi en el qual s'implementen tres processadors encastats en un FPGA de Xilinx serie 7.[EN] Embedded systems are steadily extending their application areas, dealing with increasing requirements in performance, power consumption, and area (PPA). Whenever embedded systems are used in safety-critical applications, they must also meet rigorous dependability requirements to guarantee their correct operation during an extended period of time. Meeting these requirements is especially challenging for those systems that are based on Field Programmable Gate Arrays (FPGAs), since they are very susceptible to Single Event Upsets. This leads to increased dependability threats, especially in harsh environments. In such a way, dependability should be considered as one of the primary criteria for decision making throughout the whole design flow, which should be complemented by several dependability-driven processes. First, dependability assessment quantifies the robustness of hardware designs against faults and identifies their weak points. Second, dependability-driven verification ensures the correctness and efficiency of fault mitigation mechanisms. Third, dependability benchmarking allows designers to select (from a dependability perspective) the most suitable IP cores, implementation technologies, and electronic design automation (EDA) tools. Finally, dependability-aware design space exploration (DSE) allows to optimally configure the selected IP cores and EDA tools to improve as much as possible the dependability and PPA features of resulting implementations. The aforementioned processes rely on fault injection testing to quantify the robustness of the designed systems. Despite nowadays there exists a wide variety of fault injection solutions, several important problems still should be addressed to better cover the needs of a dependability-driven design flow. In particular, simulation-based fault injection (SBFI) should be adapted to implementation-level HDL models to take into account the architecture of diverse logic primitives, while keeping the injection procedures generic and low-intrusive. Likewise, the granularity of FPGA-based fault injection (FFI) should be refined to the enable accurate identification of weak points in FPGA-based designs. Another important challenge, that dependability-driven processes face in practice, is the reduction of SBFI and FFI experimental effort. The high complexity of modern designs raises the experimental effort beyond the available time budgets, even in simple dependability assessment scenarios, and it becomes prohibitive in presence of alternative design configurations. Finally, dependability-driven processes lack an instrumental support covering the semicustom design flow in all its variety of description languages, implementation technologies, and EDA tools. Existing fault injection tools only partially cover the individual stages of the design flow, being usually specific to a particular design representation level and implementation technology. This work addresses the aforementioned challenges by efficiently integrating dependability-driven processes into the design flow. First, it proposes new SBFI and FFI approaches that enable an accurate and detailed dependability assessment at different levels of the design flow. Second, it improves the performance of dependability-driven processes by defining new techniques for accelerating SBFI and FFI experiments. Third, it defines two DSE strategies that enable the optimal dependability-aware tuning of IP cores and EDA tools, while reducing as much as possible the robustness evaluation effort. Fourth, it proposes a new toolkit (DAVOS) that automates and seamlessly integrates the aforementioned dependability-driven processes into the semicustom design flow. Finally, it illustrates the usefulness and efficiency of these proposals through a case study consisting of three soft-core embedded processors implemented on a Xilinx 7-series SoC FPGA.Tuzov, I. (2020). Dependability-driven Strategies to Improve the Design and Verification of Safety-Critical HDL-based Embedded Systems [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/159883TESI

    Adaptive Intelligent Systems for Extreme Environments

    Get PDF
    As embedded processors become powerful, a growing number of embedded systems equipped with artificial intelligence (AI) algorithms have been used in radiation environments to perform routine tasks to reduce radiation risk for human workers. On the one hand, because of the low price, commercial-off-the-shelf devices and components are becoming increasingly popular to make such tasks more affordable. Meanwhile, it also presents new challenges to improve radiation tolerance, the capability to conduct multiple AI tasks and deliver the power efficiency of the embedded systems in harsh environments. There are three aspects of research work that have been completed in this thesis: 1) a fast simulation method for analysis of single event effect (SEE) in integrated circuits, 2) a self-refresh scheme to detect and correct bit-flips in random access memory (RAM), and 3) a hardware AI system with dynamic hardware accelerators and AI models for increasing flexibility and efficiency. The variances of the physical parameters in practical implementation, such as the nature of the particle, linear energy transfer and circuit characteristics, may have a large impact on the final simulation accuracy, which will significantly increase the complexity and cost in the workflow of the transistor level simulation for large-scale circuits. It makes it difficult to conduct SEE simulations for large-scale circuits. Therefore, in the first research work, a new SEE simulation scheme is proposed, to offer a fast and cost-efficient method to evaluate and compare the performance of large-scale circuits which subject to the effects of radiation particles. The advantages of transistor and hardware description language (HDL) simulations are combined here to produce accurate SEE digital error models for rapid error analysis in large-scale circuits. Under the proposed scheme, time-consuming back-end steps are skipped. The SEE analysis for large-scale circuits can be completed in just few hours. In high-radiation environments, bit-flips in RAMs can not only occur but may also be accumulated. However, the typical error mitigation methods can not handle high error rates with low hardware costs. In the second work, an adaptive scheme combined with correcting codes and refreshing techniques is proposed, to correct errors and mitigate error accumulation in extreme radiation environments. This scheme is proposed to continuously refresh the data in RAMs so that errors can not be accumulated. Furthermore, because the proposed design can share the same ports with the user module without changing the timing sequence, it thus can be easily applied to the system where the hardware modules are designed with fixed reading and writing latency. It is a challenge to implement intelligent systems with constrained hardware resources. In the third work, an adaptive hardware resource management system for multiple AI tasks in harsh environments was designed. Inspired by the “refreshing” concept in the second work, we utilise a key feature of FPGAs, partial reconfiguration, to improve the reliability and efficiency of the AI system. More importantly, this feature provides the capability to manage the hardware resources for deep learning acceleration. In the proposed design, the on-chip hardware resources are dynamically managed to improve the flexibility, performance and power efficiency of deep learning inference systems. The deep learning units provided by Xilinx are used to perform multiple AI tasks simultaneously, and the experiments show significant improvements in power efficiency for a wide range of scenarios with different workloads. To further improve the performance of the system, the concept of reconfiguration was further extended. As a result, an adaptive DL software framework was designed. This framework can provide a significant level of adaptability support for various deep learning algorithms on an FPGA-based edge computing platform. To meet the specific accuracy and latency requirements derived from the running applications and operating environments, the platform may dynamically update hardware and software (e.g., processing pipelines) to achieve better cost, power, and processing efficiency compared to the static system

    Study of the effects of SEU-induced faults on a pipeline protected microprocessor

    Get PDF
    This paper presents a detailed analysis of the behavior of a novel fault-tolerant 32-bit embedded CPU as compared to a default (non-fault-tolerant) implementation of the same processor during a fault injection campaign of single and double faults. The fault-tolerant processor tested is characterized by per-cycle voting of microarchitectural and the flop-based architectural states, redundancy at the pipeline level, and a distributed voting scheme. Its fault-tolerant behavior is characterized for three different workloads from the automotive application domain. The study proposes statistical methods for both the single and dual fault injection campaigns and demonstrates the fault-tolerant capability of both processors in terms of fault latencies, the probability of fault manifestation, and the behavior of latent faults

    Single event upset hardened embedded domain specific reconfigurable architecture

    Get PDF
    corecore