888 research outputs found

    Simulation-based Fault Injection with QEMU for Speeding-up Dependability Analysis of Embedded Software

    Get PDF
    Simulation-based fault injection (SFI) represents a valuable solu- tion for early analysis of software dependability and fault tolerance properties before the physical prototype of the target platform is available. Some SFI approaches base the fault injection strategy on cycle-accurate models imple- mented by means of Hardware Description Languages (HDLs). However, cycle- accurate simulation has revealed to be too time-consuming when the objective is to emulate the effect of soft errors on complex microprocessors. To overcome this issue, SFI solutions based on virtual prototypes of the target platform has started to be proposed. However, current approaches still present some draw- backs, like, for example, they work only for specific CPU architectures, or they require code instrumentation, or they have a different target (i.e., design errors instead of dependability analysis). To address these disadvantages, this paper presents an efficient fault injection approach based on QEMU, one of the most efficient and popular instruction-accurate emulator for several microprocessor architectures. As main goal, the proposed approach represents a non intrusive technique for simulating hardware faults affecting CPU behaviours. Perma- nent and transient/intermittent hardware fault models have been abstracted without losing quality for software dependability analysis. The approach mini- mizes the impact of the fault injection procedure in the emulator performance by preserving the original dynamic binary translation mechanism of QEMU. Experimental results for both x86 and ARM processors proving the efficiency and effectiveness of the proposed approach are presented

    New Fault Detection, Mitigation and Injection Strategies for Current and Forthcoming Challenges of HW Embedded Designs

    Full text link
    Tesis por compendio[EN] Relevance of electronics towards safety of common devices has only been growing, as an ever growing stake of the functionality is assigned to them. But of course, this comes along the constant need for higher performances to fulfill such functionality requirements, while keeping power and budget low. In this scenario, industry is struggling to provide a technology which meets all the performance, power and price specifications, at the cost of an increased vulnerability to several types of known faults or the appearance of new ones. To provide a solution for the new and growing faults in the systems, designers have been using traditional techniques from safety-critical applications, which offer in general suboptimal results. In fact, modern embedded architectures offer the possibility of optimizing the dependability properties by enabling the interaction of hardware, firmware and software levels in the process. However, that point is not yet successfully achieved. Advances in every level towards that direction are much needed if flexible, robust, resilient and cost effective fault tolerance is desired. The work presented here focuses on the hardware level, with the background consideration of a potential integration into a holistic approach. The efforts in this thesis have focused several issues: (i) to introduce additional fault models as required for adequate representativity of physical effects blooming in modern manufacturing technologies, (ii) to provide tools and methods to efficiently inject both the proposed models and classical ones, (iii) to analyze the optimum method for assessing the robustness of the systems by using extensive fault injection and later correlation with higher level layers in an effort to cut development time and cost, (iv) to provide new detection methodologies to cope with challenges modeled by proposed fault models, (v) to propose mitigation strategies focused towards tackling such new threat scenarios and (vi) to devise an automated methodology for the deployment of many fault tolerance mechanisms in a systematic robust way. The outcomes of the thesis constitute a suite of tools and methods to help the designer of critical systems in his task to develop robust, validated, and on-time designs tailored to his application.[ES] La relevancia que la electrónica adquiere en la seguridad de los productos ha crecido inexorablemente, puesto que cada vez ésta copa una mayor influencia en la funcionalidad de los mismos. Pero, por supuesto, este hecho viene acompañado de una necesidad constante de mayores prestaciones para cumplir con los requerimientos funcionales, al tiempo que se mantienen los costes y el consumo en unos niveles reducidos. En este escenario, la industria está realizando esfuerzos para proveer una tecnología que cumpla con todas las especificaciones de potencia, consumo y precio, a costa de un incremento en la vulnerabilidad a múltiples tipos de fallos conocidos o la introducción de nuevos. Para ofrecer una solución a los fallos nuevos y crecientes en los sistemas, los diseñadores han recurrido a técnicas tradicionalmente asociadas a sistemas críticos para la seguridad, que ofrecen en general resultados sub-óptimos. De hecho, las arquitecturas empotradas modernas ofrecen la posibilidad de optimizar las propiedades de confiabilidad al habilitar la interacción de los niveles de hardware, firmware y software en el proceso. No obstante, ese punto no está resulto todavía. Se necesitan avances en todos los niveles en la mencionada dirección para poder alcanzar los objetivos de una tolerancia a fallos flexible, robusta, resiliente y a bajo coste. El trabajo presentado aquí se centra en el nivel de hardware, con la consideración de fondo de una potencial integración en una estrategia holística. Los esfuerzos de esta tesis se han centrado en los siguientes aspectos: (i) la introducción de modelos de fallo adicionales requeridos para la representación adecuada de efectos físicos surgentes en las tecnologías de manufactura actuales, (ii) la provisión de herramientas y métodos para la inyección eficiente de los modelos propuestos y de los clásicos, (iii) el análisis del método óptimo para estudiar la robustez de sistemas mediante el uso de inyección de fallos extensiva, y la posterior correlación con capas de más alto nivel en un esfuerzo por recortar el tiempo y coste de desarrollo, (iv) la provisión de nuevos métodos de detección para cubrir los retos planteados por los modelos de fallo propuestos, (v) la propuesta de estrategias de mitigación enfocadas hacia el tratamiento de dichos escenarios de amenaza y (vi) la introducción de una metodología automatizada de despliegue de diversos mecanismos de tolerancia a fallos de forma robusta y sistemática. Los resultados de la presente tesis constituyen un conjunto de herramientas y métodos para ayudar al diseñador de sistemas críticos en su tarea de desarrollo de diseños robustos, validados y en tiempo adaptados a su aplicación.[CA] La rellevància que l'electrònica adquireix en la seguretat dels productes ha crescut inexorablement, puix cada volta més aquesta abasta una major influència en la funcionalitat dels mateixos. Però, per descomptat, aquest fet ve acompanyat d'un constant necessitat de majors prestacions per acomplir els requeriments funcionals, mentre es mantenen els costos i consums en uns nivells reduïts. Donat aquest escenari, la indústria està fent esforços per proveir una tecnologia que complisca amb totes les especificacions de potència, consum i preu, tot a costa d'un increment en la vulnerabilitat a diversos tipus de fallades conegudes, i a la introducció de nous tipus. Per oferir una solució a les noves i creixents fallades als sistemes, els dissenyadors han recorregut a tècniques tradicionalment associades a sistemes crítics per a la seguretat, que en general oferixen resultats sub-òptims. De fet, les arquitectures empotrades modernes oferixen la possibilitat d'optimitzar les propietats de confiabilitat en habilitar la interacció dels nivells de hardware, firmware i software en el procés. Tot i això eixe punt no està resolt encara. Es necessiten avanços a tots els nivells en l'esmentada direcció per poder assolir els objectius d'una tolerància a fallades flexible, robusta, resilient i a baix cost. El treball ací presentat se centra en el nivell de hardware, amb la consideració de fons d'una potencial integració en una estratègia holística. Els esforços d'esta tesi s'han centrat en els següents aspectes: (i) la introducció de models de fallada addicionals requerits per a la representació adequada d'efectes físics que apareixen en les tecnologies de fabricació actuals, (ii) la provisió de ferramentes i mètodes per a la injecció eficient del models proposats i dels clàssics, (iii) l'anàlisi del mètode òptim per estudiar la robustesa de sistemes mitjançant l'ús d'injecció de fallades extensiva, i la posterior correlació amb capes de més alt nivell en un esforç per retallar el temps i cost de desenvolupament, (iv) la provisió de nous mètodes de detecció per cobrir els reptes plantejats pels models de fallades proposats, (v) la proposta d'estratègies de mitigació enfocades cap al tractament dels esmentats escenaris d'amenaça i (vi) la introducció d'una metodologia automatitzada de desplegament de diversos mecanismes de tolerància a fallades de forma robusta i sistemàtica. Els resultats de la present tesi constitueixen un conjunt de ferramentes i mètodes per ajudar el dissenyador de sistemes crítics en la seua tasca de desenvolupament de dissenys robustos, validats i a temps adaptats a la seua aplicació.Espinosa García, J. (2016). New Fault Detection, Mitigation and Injection Strategies for Current and Forthcoming Challenges of HW Embedded Designs [Tesis doctoral no publicada]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/73146TESISCompendi

    International White Book on DER Protection : Review and Testing Procedures

    Get PDF
    This white book provides an insight into the issues surrounding the impact of increasing levels of DER on the generator and network protection and the resulting necessary improvements in protection testing practices. Particular focus is placed on ever increasing inverter-interfaced DER installations and the challenges of utility network integration. This white book should also serve as a starting point for specifying DER protection testing requirements and procedures. A comprehensive review of international DER protection practices, standards and recommendations is presented. This is accompanied by the identifi cation of the main performance challenges related to these protection schemes under varied network operational conditions and the nature of DER generator and interface technologies. Emphasis is placed on the importance of dynamic testing that can only be delivered through laboratory-based platforms such as real-time simulators, integrated substation automation infrastructure and fl exible, inverter-equipped testing microgrids. To this end, the combination of fl exible network operation and new DER technologies underlines the importance of utilising the laboratory testing facilities available within the DERlab Network of Excellence. This not only informs the shaping of new protection testing and network integration practices by end users but also enables the process of de-risking new DER protection technologies. In order to support the issues discussed in the white paper, a comparative case study between UK and German DER protection and scheme testing practices is presented. This also highlights the level of complexity associated with standardisation and approval mechanisms adopted by different countries

    Fault-based Analysis of Industrial Cyber-Physical Systems

    Get PDF
    The fourth industrial revolution called Industry 4.0 tries to bridge the gap between traditional Electronic Design Automation (EDA) technologies and the necessity of innovating in many indus- trial fields, e.g., automotive, avionic, and manufacturing. This complex digitalization process in- volves every industrial facility and comprises the transformation of methodologies, techniques, and tools to improve the efficiency of every industrial process. The enhancement of functional safety in Industry 4.0 applications needs to exploit the studies related to model-based and data-driven anal- yses of the deployed Industrial Cyber-Physical System (ICPS). Modeling an ICPS is possible at different abstraction levels, relying on the physical details included in the model and necessary to describe specific system behaviors. However, it is extremely complicated because an ICPS is com- posed of heterogeneous components related to different physical domains, e.g., digital, electrical, and mechanical. In addition, it is also necessary to consider not only nominal behaviors but even faulty behaviors to perform more specific analyses, e.g., predictive maintenance of specific assets. Nevertheless, these faulty data are usually not present or not available directly from the industrial machinery. To overcome these limitations, constructing a virtual model of an ICPS extended with different classes of faults enables the characterization of faulty behaviors of the system influenced by different faults. In literature, these topics are addressed with non-uniformly approaches and with the absence of standardized and automatic methodologies for describing and simulating faults in the different domains composing an ICPS. This thesis attempts to overcome these state-of-the-art gaps by proposing novel methodologies, techniques, and tools to: model and simulate analog and multi-domain systems; abstract low-level models to higher-level behavioral models; and monitor industrial systems based on the Industrial Internet of Things (IIOT) paradigm. Specifically, the proposed contributions involve the exten- sion of state-of-the-art fault injection practices to improve the ICPSs safety, the development of frameworks for safety operations automatization, and the definition of a monitoring framework for ICPSs. Overall, fault injection in analog and digital models is the state of the practice to en- sure functional safety, as mentioned in the ISO 26262 standard specific for the automotive field. Starting from state-of-the-art defects defined for analog descriptions, new defects are proposed to enhance the IEEE P2427 draft standard for analog defect modeling and coverage. Moreover, dif- ferent techniques to abstract a transistor-level model to a behavioral model are proposed to speed up the simulation of faulty circuits. Therefore, unlike the electrical domain, there is no extensive use of fault injection techniques in the mechanical one. Thus, extending the fault injection to the mechanical and thermal fields allows for supporting the definition and evaluation of more reliable safety mechanisms. Hence, a taxonomy of mechanical faults is derived from the electrical domain by exploiting the physical analogies. Furthermore, specific tools are built for automatically instru- menting different descriptions with multi-domain faults. The entire work is proposed as a basis for supporting the creation of increasingly resilient and secure ICPS that need to preserve functional safety in any operating context

    Re-use of tests and arguments for assesing dependable mixed-critically systems

    Get PDF
    The safety assessment of mixed-criticality systems (MCS) is a challenging activity due to system heterogeneity, design constraints and increasing complexity. The foundation for MCSs is the integrated architecture paradigm, where a compact hardware comprises multiple execution platforms and communication interfaces to implement concurrent functions with different safety requirements. Besides a computing platform providing adequate isolation and fault tolerance mechanism, the development of an MCS application shall also comply with the guidelines defined by the safety standards. A way to lower the overall MCS certification cost is to adopt a platform-based design (PBD) development approach. PBD is a model-based development (MBD) approach, where separate models of logic, hardware and deployment support the analysis of the resulting system properties and behaviour. The PBD development of MCSs benefits from a composition of modular safety properties (e.g. modular safety cases), which support the derivation of mixed-criticality product lines. The validation and verification (V&V) activities claim a substantial effort during the development of programmable electronics for safety-critical applications. As for the MCS dependability assessment, the purpose of the V&V is to provide evidences supporting the safety claims. The model-based development of MCSs adds more V&V tasks, because additional analysis (e.g., simulations) need to be carried out during the design phase. During the MCS integration phase, typically hardware-in-the-loop (HiL) plant simulators support the V&V campaigns, where test automation and fault-injection are the key to test repeatability and thorough exercise of the safety mechanisms. This dissertation proposes several V&V artefacts re-use strategies to perform an early verification at system level for a distributed MCS, artefacts that later would be reused up to the final stages in the development process: a test code re-use to verify the fault-tolerance mechanisms on a functional model of the system combined with a non-intrusive software fault-injection, a model to X-in-the-loop (XiL) and code-to-XiL re-use to provide models of the plant and distributed embedded nodes suited to the HiL simulator, and finally, an argumentation framework to support the automated composition and staged completion of modular safety-cases for dependability assessment, in the context of the platform-based development of mixed-criticality systems relying on the DREAMS harmonized platform.La dificultad para evaluar la seguridad de los sistemas de criticidad mixta (SCM) aumenta con la heterogeneidad del sistema, las restricciones de diseño y una complejidad creciente. Los SCM adoptan el paradigma de arquitectura integrada, donde un hardware embebido compacto comprende múltiples plataformas de ejecución e interfaces de comunicación para implementar funciones concurrentes y con diferentes requisitos de seguridad. Además de una plataforma de computación que provea un aislamiento y mecanismos de tolerancia a fallos adecuados, el desarrollo de una aplicación SCM además debe cumplir con las directrices definidas por las normas de seguridad. Una forma de reducir el coste global de la certificación de un SCM es adoptar un enfoque de desarrollo basado en plataforma (DBP). DBP es un enfoque de desarrollo basado en modelos (DBM), en el que modelos separados de lógica, hardware y despliegue soportan el análisis de las propiedades y el comportamiento emergente del sistema diseñado. El desarrollo DBP de SCMs se beneficia de una composición modular de propiedades de seguridad (por ejemplo, casos de seguridad modulares), que facilitan la definición de líneas de productos de criticidad mixta. Las actividades de verificación y validación (V&V) representan un esfuerzo sustancial durante el desarrollo de aplicaciones basadas en electrónica confiable. En la evaluación de la seguridad de un SCM el propósito de las actividades de V&V es obtener las evidencias que apoyen las aseveraciones de seguridad. El desarrollo basado en modelos de un SCM incrementa las tareas de V&V, porque permite realizar análisis adicionales (por ejemplo, simulaciones) durante la fase de diseño. En las campañas de pruebas de integración de un SCM habitualmente se emplean simuladores de planta hardware-in-the-loop (HiL), en donde la automatización de pruebas y la inyección de faltas son la clave para la repetitividad de las pruebas y para ejercitar completamente los mecanismos de tolerancia a fallos. Esta tesis propone diversas estrategias de reutilización de artefactos de V&V para la verificación temprana de un MCS distribuido, artefactos que se emplearán en ulteriores fases del desarrollo: la reutilización de código de prueba para verificar los mecanismos de tolerancia a fallos sobre un modelo funcional del sistema combinado con una inyección de fallos de software no intrusiva, la reutilización de modelo a X-in-the-loop (XiL) y código a XiL para obtener modelos de planta y nodos distribuidos aptos para el simulador HiL y, finalmente, un marco de argumentación para la composición automatizada y la compleción escalonada de casos de seguridad modulares, en el contexto del desarrollo basado en plataformas de sistemas de criticidad mixta empleando la plataforma armonizada DREAMS.Kritikotasun nahastuko sistemen segurtasun ebaluazioa jarduera neketsua da beraien heterogeneotasuna dela eta. Sistema hauen oinarria arkitektura integratuen paradigman datza, non hardware konpaktu batek exekuzio plataforma eta komunikazio interfaze ugari integratu ahal dituen segurtasun baldintza desberdineko funtzio konkurrenteak inplementatzeko. Konputazio plataformek isolamendu eta akatsen aurkako mekanismo egokiak emateaz gain, segurtasun arauek definituriko jarraibideak jarraitu behar dituzte kritikotasun mistodun aplikazioen garapenean. Sistema hauen zertifikazio prozesuaren kostua murrizteko aukera bat plataformetan oinarritutako garapenean (PBD) datza. Garapen planteamendu hau modeloetan oinarrituriko garapena da (MBD) non modeloaren logika, hardware eta garapen desberdinak sistemaren propietateen eta portaeraren aurka aztertzen diren. Kritikotasun mistodun sistemen PBD garapenak etekina ateratzen dio moduluetan oinarrituriko segurtasun propietateei, adibidez: segurtasun kasu modularrak (MSC). Modulu hauek kritikotasun mistodun produktu-lerroak ere hartzen dituzte kontutan. Berifikazio eta balioztatze (V&V) jarduerek esfortzu kontsideragarria eskatzen dute segurtasun-kiritikoetarako elektronika programagarrien garapenean. Kritikotasun mistodun sistemen konfiantzaren ebaluazioaren eta V&V jardueren helburua segurtasun eskariak jasotzen dituzten frogak proportzionatzea da. Kritikotasun mistodun sistemen modelo bidezko garapenek zeregin gehigarriak atxikitzen dizkio V&V jarduerari, fase honetan analisi gehigarriak (hots, simulazioak) zehazten direlako. Bestalde, kritikotasun mistodun sistemen integrazio fasean, hardware-in-the-loop (Hil) simulazio plantek V&V iniziatibak sostengatzen dituzte non testen automatizazioan eta akatsen txertaketan funtsezko jarduerak diren. Jarduera hauek frogen errepikapena eta segurtasun mekanismoak egiaztzea ahalbidetzen dute. Tesi honek V&V artefaktuen berrerabilpenerako estrategiak proposatzen ditu, kritikotasun mistodun sistemen egiaztatze azkarrerako sistema mailan eta garapen prozesuko azken faseetaraino erabili daitezkeenak. Esate baterako, test kodearen berrabilpena akats aurkako mekanismoak egiaztatzeko, modelotik X-in-the-loop (XiL)-ra eta kodetik XiL-rako konbertsioa HiL simulaziorako eta argumentazio egitura bat DREAMS Europear proiektuan definituriko arkitektura estiloan oinarrituriko segurtasun kasu modularrak automatikoki eta gradualki sortzeko

    Analysing system susceptibility to faults with simulation tools

    Get PDF
    In the paper we present original fault simulation tools developed in our Institute. These tools are targeted at system dependability evaluation. They provide mechanisms for detailed and aggregated fault effect analysis. Based on our experience with testing various software applications we outline the most important problems and discuss a sample of simulation results

    Analysis and Mitigation of Soft-Errors on High Performance Embedded GPUs

    Get PDF
    Multiprocessor system-on-chip such as embedded GPUs are becoming very popular in safety-critical applications, such as autonomous and semi-autonomous vehicles. However, these devices can suffer from the effects of soft-errors, such as those produced by radiation effects. These effects are able to generate unpredictable misbehaviors. Fault tolerance oriented to multi-threaded software introduces severe performance degradations due to the redundancy, voting and correction threads operations. In this paper, we propose a new fault injection environment for NVIDIA GPGPU devices and a fault tolerance approach based on error detection and correction threads executed during data transfer operations on embedded GPUs. The fault injection environment is capable of automatically injecting faults into the instructions at SASS level by instrumenting the CUDA binary executable file. The mitigation approach is based on concurrent error detection threads running simultaneously with the memory stream device to host data transfer operations. With several benchmark applications, we evaluate the impact of softerrors classifying Silent Data Corruption, Detection, Unrecoverable Error and Hang. Finally, the proposed mitigation approach has been validated by soft-error fault injection campaigns on an NVIDIA Pascal Architecture GPU controlled by Quad-Core A57 ARM processor (JETSON TX2) demonstrating an advantage of more than 37% with respect to state of the art solution

    Moving Towards Analog Functional Safety

    Get PDF
    Over the past century, the exponential growth of the semiconductor industry has led to the creation of tiny and complex integrated circuits, e.g., sensors, actuators, and smart power systems. Innovative techniques are needed to ensure the correct functionality of analog devices that are ubiquitous in every smart system. The standard ISO 26262 related to functional safety in the automotive context specifies that fault injection is necessary to validate all electronic devices. For decades, standardizing fault modeling, injection and simulation mainly focused on digital circuits and disregarding analog ones. An initial attempt is being made with the IEEE P2427 standard draft standard that started to give this field a structured and formal organization. In this context, new fault models, injection, and abstraction methodologies for analog circuits are proposed in this thesis to enhance this application field. The faults proposed by the IEEE P2427 standard draft standard are initially evaluated to understand the associated fault behaviors during the simulation. Moreover, a novel approach is presented for modeling realistic stuck-on/off defects based on oxide defects. These new defects proposed are required because digital stuck-at-fault models where a transistor is frozen in on-state or offstate may not apply well on analog circuits because even a slight variation could create deviations of several magnitudes. Then, for validating the proposed defects models, a novel predictive fault grouping based on faulty AC matrices is applied to group faults with equivalent behaviors. The proposed fault grouping method is computationally cheap because it avoids performing DC or transient simulations with faults injected and limits itself to faulty AC simulations. Using AC simulations results in two different methods that allow grouping faults with the same frequency response are presented. The first method is an AC-based grouping method that exploits the potentialities of the S-parameters ports. While the second is a Circle-based grouping based on the circle-fitting method applied to the extracted AC matrices. Finally, an open-source framework is presented for the fault injection and manipulation perspective. This framework relies on the shared semantics for reading, writing, or manipulating transistor-level designs. The ultimate goal of the framework is: reading an input design written in a specific syntax and then allowing to write the same design in another syntax. As a use case for the proposed framework, a process of analog fault injection is discussed. This activity requires adding, removing, or replacing nodes, components, or even entire sub-circuits. The framework is entirely written in C++, and its APIs are also interfaced with Python. The entire framework is open-source and available on GitHub. The last part of the thesis presents abstraction methodologies that can abstract transistor level models into Verilog-AMS models and Verilog- AMS piecewise and nonlinear models into C++. These abstracted models can be integrated into heterogeneous systems. The purpose of integration is the simulation of heterogeneous components embedded in a Virtual Platforms (VP) needs to be fast and accurate

    Design and Validation of an Open-Source Closed-Loop Testbed for Artificial Pancreas Systems

    Full text link
    The development of a fully autonomous artificial pancreas system (APS) to independently regulate the glucose levels of a patient with Type 1 diabetes has been a long-standing goal of diabetes research. A significant barrier to progress is the difficulty of testing new control algorithms and safety features, since clinical trials are time- and resource-intensive. To facilitate ease of validation, we propose an open-source APS testbed by integrating APS controllers with two state-of-the-art glucose simulators and a novel fault injection engine. The testbed is able to reproduce the blood glucose trajectories of real patients from a clinical trial conducted over six months. We evaluate the performance of two closed-loop control algorithms (OpenAPS and Basal Bolus) using the testbed and find that more advanced control algorithms are able to keep blood glucose in a safe region 93.49% and 79.46% of the time on average, compared with 66.18% of the time for the clinical trial. The fault injection engine simulates the real recalls and adverse events reported to the U.S. Food and Drug Administration (FDA) and demonstrates the resilience of the controller in hazardous conditions. We used the testbed to generate 2.5 years of synthetic data representing 20 different patient profiles with realistic adverse event scenarios, which would have been expensive and risky to collect in a clinical trial. The proposed testbed is a valid tool that can be used by the research community to demonstrate the effectiveness of different control algorithms and safety features for APS.Comment: 12 pages, 12 figures, to appear in the IEEE/ACM International Conference on Connected Health: Applications, Systems and Engineering Technologies (CHASE), 202
    corecore