1,070 research outputs found

    Hardware-Assisted Dependable Systems

    Get PDF
    Unpredictable hardware faults and software bugs lead to application crashes, incorrect computations, unavailability of internet services, data losses, malfunctioning components, and consequently financial losses or even death of people. In particular, faults in microprocessors (CPUs) and memory corruption bugs are among the major unresolved issues of today. CPU faults may result in benign crashes and, more problematically, in silent data corruptions that can lead to catastrophic consequences, silently propagating from component to component and finally shutting down the whole system. Similarly, memory corruption bugs (memory-safety vulnerabilities) may result in a benign application crash but may also be exploited by a malicious hacker to gain control over the system or leak confidential data. Both these classes of errors are notoriously hard to detect and tolerate. Usual mitigation strategy is to apply ad-hoc local patches: checksums to protect specific computations against hardware faults and bug fixes to protect programs against known vulnerabilities. This strategy is unsatisfactory since it is prone to errors, requires significant manual effort, and protects only against anticipated faults. On the other extreme, Byzantine Fault Tolerance solutions defend against all kinds of hardware and software errors, but are inadequately expensive in terms of resources and performance overhead. In this thesis, we examine and propose five techniques to protect against hardware CPU faults and software memory-corruption bugs. All these techniques are hardware-assisted: they use recent advancements in CPU designs and modern CPU extensions. Three of these techniques target hardware CPU faults and rely on specific CPU features: ∆-encoding efficiently utilizes instruction-level parallelism of modern CPUs, Elzar re-purposes Intel AVX extensions, and HAFT builds on Intel TSX instructions. The rest two target software bugs: SGXBounds detects vulnerabilities inside Intel SGX enclaves, and “MPX Explained” analyzes the recent Intel MPX extension to protect against buffer overflow bugs. Our techniques achieve three goals: transparency, practicality, and efficiency. All our systems are implemented as compiler passes which transparently harden unmodified applications against hardware faults and software bugs. They are practical since they rely on commodity CPUs and require no specialized hardware or operating system support. Finally, they are efficient because they use hardware assistance in the form of CPU extensions to lower performance overhead

    GPGPU Reliability Analysis: From Applications to Large Scale Systems

    Get PDF
    Over the past decade, GPUs have become an integral part of mainstream high-performance computing (HPC) facilities. Since applications running on HPC systems are usually long-running, any error or failure could result in significant loss in scientific productivity and system resources. Even worse, since HPC systems face severe resilience challenges as progressing towards exascale computing, it is imperative to develop a better understanding of the reliability of GPUs. This dissertation fills this gap by providing an understanding of the effects of soft errors on the entire system and on specific applications. To understand system-level reliability, a large-scale study on GPU soft errors in the field is conducted. The occurrences of GPU soft errors are linked to several temporal and spatial features, such as specific workloads, node location, temperature, and power consumption. Further, machine learning models are proposed to predict error occurrences on GPU nodes so as to proactively and dynamically turning on/off the costly error protection mechanisms based on prediction results. To understand the effects of soft errors at the application level, an effective fault-injection framework is designed aiming to understand the reliability and resilience characteristics of GPGPU applications. This framework is effective in terms of reducing the tremendous number of fault injection locations to a manageable size while still preserving remarkable accuracy. This framework is validated with both single-bit and multi-bit fault models for various GPGPU benchmarks. Lastly, taking advantage of the proposed fault-injection framework, this dissertation develops a hierarchical approach to understanding the error resilience characteristics of GPGPU applications at kernel, CTA, and warp levels. In addition, given that some corrupted application outputs due to soft errors may be acceptable, we present a use case to show how to enable low-overhead yet reliable GPU computing for GPGPU applications

    Software-based and regionally-oriented traffic management in Networks-on-Chip

    Get PDF
    Since the introduction of chip-multiprocessor systems, the number of integrated cores has been steady growing and workload applications have been adapted to exploit the increasing parallelism. This changed the importance of efficient on-chip communication significantly and the infrastructure has to keep step with these new requirements. The work at hand makes significant contributions to the state-of-the-art of the latest generation of such solutions, called Networks-on-Chip, to improve the performance, reliability, and flexible management of these on-chip infrastructures

    Radiation Tolerant Electronics, Volume II

    Get PDF
    Research on radiation tolerant electronics has increased rapidly over the last few years, resulting in many interesting approaches to model radiation effects and design radiation hardened integrated circuits and embedded systems. This research is strongly driven by the growing need for radiation hardened electronics for space applications, high-energy physics experiments such as those on the large hadron collider at CERN, and many terrestrial nuclear applications, including nuclear energy and safety management. With the progressive scaling of integrated circuit technologies and the growing complexity of electronic systems, their ionizing radiation susceptibility has raised many exciting challenges, which are expected to drive research in the coming decade.After the success of the first Special Issue on Radiation Tolerant Electronics, the current Special Issue features thirteen articles highlighting recent breakthroughs in radiation tolerant integrated circuit design, fault tolerance in FPGAs, radiation effects in semiconductor materials and advanced IC technologies and modelling of radiation effects

    Development of a fault tolerant MOS field effect power semiconductor switching transistor

    Get PDF
    This work describes the development of a semiconductor switch to replace an electromechanical contactor as used within the electrical power distribution system of the More Electric Aircraft (MEA; a project begun in the 1990‟s by the United States Air Force). The MEA is safety critical and therefore requires highest reliability components and systems, but subsequent to a short circuit load fault the electro-mechanical contactor switch often welds shut. This risk is increased when using high discharge energy lithium ion dc batteries. Predominately the semiconductor switch controls inductive loads and is required to safely turn off current of up to 10 times the nominal level during sporadic load fault events. The switch requires the lowest static loss (lowest on state resistance), but also the lowest dynamic loss (losses due to the switching event). Presently, unipolar devices provide the lowest dynamic loss, but bipolar devices provide the lowest static loss. One possible solution is use of a Metal Oxide Semiconductor Field Effect Transistor (MOSFET), the area of which is sized to suit the fault current, but at relatively high cost in terms of silicon area. The resultant area is typically achieved by several die connected in parallel, unfortunately, such a solution suffers from current share imbalance and the potential of cascade die failure. The use of a parallel combination of unipolar and bipolar device types (MOSFET and Insulated Gate Bipolar Transistors, IGBTs) to form a hybrid appears to offer the potential to reduce the silicon area, and static loss, whilst reducing the impact of the increased dynamic losses of the IGBT. Unfortunately, this goal requires optimised gate timing of the resultant hybrid which proves challenging if the load current is to be shared appropriately during fault switching in order to prevent failure. Some form of single MOS (Metal Oxide Semiconductor) gated integrated hybrid device with self biased bipolar injection is therefore required to ensure highest reliability through a non latching design which offers lowest losses under all conditions and achieves an even temperature distribution. In this work the novel concept of the integrated hybrid device has been investigated at a low Blocking Voltage (BV) rating of 100 V, using computer simulation. The three terminal hybrid silicon DMOS (Double diffused Metal Oxide Semiconductor) device utilises a novel merged Schottky p-type injector to provide self biased entry into a reduced static loss bipolar state in the event of high fault current. The device achieves a specific on state resistance, R(ON,SP) = 1.16 mΩcm2 in bipolar mode (with BV=84 V), that is below the silicon limit line and requires half the area of a traditional unipolar MOSFET to conduct fault current. During comparative standard unclamped inductive switching trials, the hybrid device provides a self clamping action which enables increased inductive energy switching (higher inductance and/or higher load current), relative to that achieved by either the MOSFET or IGBT. The hybrid conducting in bipolar mode switches an inductive load off much faster than that typically achieved by an IGBT (toff =20 ns, in comparison to typically >10 μs for an IGBT). This results in a low turn off energy for the hybrid (1.26*10-4 J/cm2) as compared to that of the IGBT (8.72*10-3 J/cm2). The hybrid dynamic performance is enhanced by the action of the merged Schottky contact which, unlike the IGBT, acts to limit the emitter base voltage (VEB) of the internal PNP Bipolar Junction Transistor, BJT (the integral PNP BJT is otherwise a shared feature with the IGBT). The self biased bipolar activation is achieved at a forward bias (VAK) =1.3 V at temperature (T)= 300 K. The device is latch up free across the operational temperature range of T=233 K to 400 K. A viable charge balanced structure to increase the BV rating to approximately 600 V is also proposed. The resulting performance of the single gated, self biased, hybrid, utilising a novel merged Schottky/P type injector, could lead to a new class of rugged MOS gated power switching devices in silicon and potentially silicon carbide

    Best Practices and Methodological Guidelines for Conducting Gas Risk Assessments

    Get PDF
    The EC Regulation concerning measures to safeguard security of gas supply (EC/994/2010) requires member states to make a full assessment of the risks affecting the security of gas supply. According to Article 9, this risk assessment must: (a) use the infrastructure and supply standards (articles 6 and 8); (b) take into account all relevant national and regional circumstances; (c) run various disruption scenarios; (d) identify the interaction and correlation of risks with other Member States. (e) take into account the maximal interconnection capacity of each border entry and exit point. The objective of this report is to provide guidance and advice for performing risk assessments. It will do so by first providing a literature review, and then by proposing a basic structure for undertaking a gas security risk assessment, in accordance with best practices and standard procedures found in risk management.JRC.F.3-Energy securit

    Dependable Embedded Systems

    Get PDF
    This Open Access book introduces readers to many new techniques for enhancing and optimizing reliability in embedded systems, which have emerged particularly within the last five years. This book introduces the most prominent reliability concerns from today’s points of view and roughly recapitulates the progress in the community so far. Unlike other books that focus on a single abstraction level such circuit level or system level alone, the focus of this book is to deal with the different reliability challenges across different levels starting from the physical level all the way to the system level (cross-layer approaches). The book aims at demonstrating how new hardware/software co-design solution can be proposed to ef-fectively mitigate reliability degradation such as transistor aging, processor variation, temperature effects, soft errors, etc. Provides readers with latest insights into novel, cross-layer methods and models with respect to dependability of embedded systems; Describes cross-layer approaches that can leverage reliability through techniques that are pro-actively designed with respect to techniques at other layers; Explains run-time adaptation and concepts/means of self-organization, in order to achieve error resiliency in complex, future many core systems

    Realistic adversarial machine learning to improve network intrusion detection

    Get PDF
    Modern organizations can significantly benefit from the use of Artificial Intelligence (AI), and more specifically Machine Learning (ML), to tackle the growing number and increasing sophistication of cyber-attacks targeting their business processes. However, there are several technological and ethical challenges that undermine the trustworthiness of AI. One of the main challenges is the lack of robustness, which is an essential property to ensure that ML is used in a secure way. Improving robustness is no easy task because ML is inherently susceptible to adversarial examples: data samples with subtle perturbations that cause unexpected behaviors in ML models. ML engineers and security practitioners still lack the knowledge and tools to prevent such disruptions, so adversarial examples pose a major threat to ML and to the intelligent Network Intrusion Detection (NID) systems that rely on it. This thesis presents a methodology for a trustworthy adversarial robustness analysis of multiple ML models, and an intelligent method for the generation of realistic adversarial examples in complex tabular data domains like the NID domain: Adaptative Perturbation Pattern Method (A2PM). It is demonstrated that a successful adversarial attack is not guaranteed to be a successful cyber-attack, and that adversarial data perturbations can only be realistic if they are simultaneously valid and coherent, complying with the domain constraints of a real communication network and the class-specific constraints of a certain cyber-attack class. A2PM can be used for adversarial attacks, to iteratively cause misclassifications, and adversarial training, to perform data augmentation with slightly perturbed data samples. Two case studies were conducted to evaluate its suitability for the NID domain. The first verified that the generated perturbations preserved both validity and coherence in Enterprise and Internet-of Things (IoT) network scenarios, achieving realism. The second verified that adversarial training with simple perturbations enables the models to retain a good generalization to regular IoT network traffic flows, in addition to being more robust to adversarial examples. The key takeaway of this thesis is: ML models can be incredibly valuable to improve a cybersecurity system, but their own vulnerabilities must not be disregarded. It is essential to continue the research efforts to improve the security and trustworthiness of ML and of the intelligent systems that rely on it.Organizações modernas podem beneficiar significativamente do uso de Inteligência Artificial (AI), e mais especificamente Aprendizagem Automática (ML), para enfrentar a crescente quantidade e sofisticação de ciberataques direcionados aos seus processos de negócio. No entanto, há vários desafios tecnológicos e éticos que comprometem a confiabilidade da AI. Um dos maiores desafios é a falta de robustez, que é uma propriedade essencial para garantir que se usa ML de forma segura. Melhorar a robustez não é uma tarefa fácil porque ML é inerentemente suscetível a exemplos adversos: amostras de dados com perturbações subtis que causam comportamentos inesperados em modelos ML. Engenheiros de ML e profissionais de segurança ainda não têm o conhecimento nem asferramentas necessárias para prevenir tais disrupções, por isso os exemplos adversos representam uma grande ameaça a ML e aos sistemas de Deteção de Intrusões de Rede (NID) que dependem de ML. Esta tese apresenta uma metodologia para uma análise da robustez de múltiplos modelos ML, e um método inteligente para a geração de exemplos adversos realistas em domínios de dados tabulares complexos como o domínio NID: Método de Perturbação com Padrões Adaptativos (A2PM). É demonstrado que um ataque adverso bem-sucedido não é garantidamente um ciberataque bem-sucedido, e que as perturbações adversas só são realistas se forem simultaneamente válidas e coerentes, cumprindo as restrições de domínio de uma rede de computadores real e as restrições específicas de uma certa classe de ciberataque. A2PM pode ser usado para ataques adversos, para iterativamente causar erros de classificação, e para treino adverso, para realizar aumento de dados com amostras ligeiramente perturbadas. Foram efetuados dois casos de estudo para avaliar a sua adequação ao domínio NID. O primeiro verificou que as perturbações preservaram tanto a validade como a coerência em cenários de redes Empresariais e Internet-das-Coisas (IoT), alcançando o realismo. O segundo verificou que o treino adverso com perturbações simples permitiu aos modelos reter uma boa generalização a fluxos de tráfego de rede IoT, para além de serem mais robustos contra exemplos adversos. A principal conclusão desta tese é: os modelos ML podem ser incrivelmente valiosos para melhorar um sistema de cibersegurança, mas as suas próprias vulnerabilidades não devem ser negligenciadas. É essencial continuar os esforços de investigação para melhorar a segurança e a confiabilidade de ML e dos sistemas inteligentes que dependem de ML
    corecore