488 research outputs found

    Cross-layer Soft Error Analysis and Mitigation at Nanoscale Technologies

    This thesis addresses the challenge of soft error modeling and mitigation in nanoscale technology nodes and pushes the state of the art forward by proposing novel modeling, analysis and mitigation techniques. The proposed soft error sensitivity analysis platform accurately models both error generation and propagation, from technology-dependent device-level simulations all the way to workload-dependent application-level analysis

    Review of Fault Mitigation Approaches for Deep Neural Networks for Computer Vision in Autonomous Driving

    The aim of this work is to identify and present challenges and risks related to the employment of DNNs in Computer Vision for Autonomous Driving. Nowadays, one of the major technological challenges is choosing the right technology from the abundance available on the market. Specifically, this thesis collects a synopsis of the state-of-the-art architectures, techniques and methodologies adopted for building fault-tolerant hardware and ensuring robustness in DNN-based Computer Vision applications for Autonomous Driving

    Resilience Strategies for Network Challenge Detection, Identification and Remediation

    The enormous growth of the Internet and its use in everyday life make it an attractive target for malicious users. As the network becomes more complex and sophisticated, it becomes more vulnerable to attack. There is a pressing need for the future Internet to be resilient, manageable and secure. Our research is on distributed challenge detection and is part of the EU ResumeNet Project (Resilience and Survivability for Future Networking: Framework, Mechanisms and Experimental Evaluation). It aims to make networks more resilient to a wide range of challenges, including malicious attacks, misconfiguration, faults, and operational overloads. Resilience means the ability of the network to provide an acceptable level of service in the face of significant challenges; it is a superset of commonly used definitions for survivability, dependability, and fault tolerance. Our proposed resilience strategy can detect a challenge situation by identifying its occurrence and impact in real time, then initiating appropriate remedial action. Action is taken autonomously to continue operations as far as possible and to mitigate the damage, allowing an acceptable level of service to be maintained. The contribution of our work is the ability to mitigate a challenge as early as possible and rapidly detect its root cause. In addition, our proposed multi-stage, policy-based challenge detection system identifies both existing and unforeseen challenges. This has been studied and demonstrated with an unknown worm attack. Our multi-stage approach reduces computational complexity compared with the traditional single-stage approach, in which one particular managed object is responsible for all the functions. The approach we propose in this thesis has the flexibility, scalability, adaptability, reproducibility and extensibility needed to assist in the identification and remediation of many future network challenges

    Dependability analysis of parallel systems using a simulation-based approach

    The analysis of dependability in large, complex, parallel systems executing real applications or workloads is examined in this thesis. To effectively demonstrate the wide range of dependability problems that can be analyzed through simulation, the analysis of three case studies is presented. For each case, the organization of the simulation model used is outlined, and the results from simulated fault injection experiments are explained, showing the usefulness of this method in dependability modeling of large parallel systems. The simulation models are constructed using DEPEND and C++. Where possible, methods to increase dependability are derived from the experimental results. Another interesting facet of all three cases is the presence of some kind of workload or application executing in the simulation while faults are injected. This adds a completely new dimension to this type of study, one that cannot be modeled accurately with analytical approaches

    Prognostics and health management of power electronics

    Prognostics and health management (PHM) is a major tool enabling systems to evaluate their reliability in real-time operation. Despite ground-breaking advances in most engineering and scientific disciplines during the past decades, reliability engineering has not seen significant breakthroughs or noticeable advances. Therefore, self-awareness of the embedded system is also often required, in the sense that the system should be able to assess its own health state and failure records, and those of its main components, and take appropriate action. This thesis presents a radically new prognostics approach to reliable system design that will revolutionise complex power electronic systems with robust, prognostics-capability-enhanced Insulated Gate Bipolar Transistors (IGBTs) in applications where reliability is both challenging and critical. The IGBT is considered one of the components most often damaged in converters, and it experiences a number of failure mechanisms, such as bond wire lift-off, die-attach solder cracking, loose gate control voltage, etc. The resulting effects are complex. For instance, solder crack growth raises the IGBT's junction temperature, and the added heat in turn leads to bond wire lift-off. As a result, this failure often shows up as an increase in on-state resistance, reflected in the on-state collector-emitter voltage drop. On the other hand, hot carrier injection increases due to electrical stress. Additionally, IGBTs mainly work under high stress, temperature and power consumption because of the high load range that these devices need to switch. This accelerates the degradation of the power switches, which proceeds in discrete steps until a failure state is reached after several hundred cycles.
To this end, knowledge of IGBT failure mechanisms and the identification of failure-indicating parameters provide the background for developing a failure model and a prognostics algorithm that calculates remaining useful life (RUL) along with ±10% confidence bounds. A number of prognostics models have been developed for forecasting the time to failure of IGBTs, and the performance of the presented estimation models has been evaluated using two different evaluation metrics. The results show significant improvement in health monitoring capability for power switches. Furthermore, the reliability of the power switch was calculated to fully describe the health state of the converter and to reconfigure the control parameters using an adaptive algorithm under degradation and load mission limitations. As a result, the life expectancy of the devices has been increased. Together, these allow condition-monitoring facilities to minimise stress levels and predict future failures, which greatly reduces the likelihood of power switch failures in the first place

    Micro-Policies: Formally Verified, Tag-Based Security Monitors

    Recent advances in hardware design have demonstrated mechanisms allowing a wide range of low-level security policies (or micro-policies) to be expressed using rules on metadata tags. We propose a methodology for defining and reasoning about such tag-based reference monitors in terms of a high-level “symbolic machine,” and we use this methodology to define and formally verify micro-policies for dynamic sealing, compartmentalization, control-flow integrity, and memory safety; in addition, we show how to use the tagging mechanism to protect its own integrity. For each micro-policy, we prove by refinement that the symbolic machine instantiated with the policy’s rules embodies a high-level specification characterizing a useful security property. Last, we show how the symbolic machine itself can be implemented in terms of a hardware rule cache and a software controller

    Combining symbolic conflict recognition with Markov Chains for fault identification
