6 research outputs found

    Compensating for Timing Jitter in Computing Systems with General-Purpose Operating Systems

    National Computational Infrastructure for Lattice Gauge Theory SciDAC-2 Closeout Report

    Under its SciDAC-1 and SciDAC-2 grants, the USQCD Collaboration developed software and algorithmic infrastructure for the numerical study of lattice gauge theories

    Observer-based Anomaly Diagnosis and Mitigation for Cyber-Physical Systems

    Cyber-Physical Systems (CPS) seamlessly integrate computational devices, communication networks, and physical processes. The performance and functionality of many critical infrastructures such as power, traffic, and health-care networks and smart cities rely on advances in CPS. However, higher connectivity increases the vulnerability of CPS because it exposes them to threats from both the cyber domain and the physical domain. An attack or a fault within the cyber or physical domain can subsequently affect the cyber domain, the physical domain, or both, resulting in anomalies. An attack or a fault on CPS can have serious or even lethal consequences. Traditional anomaly diagnosis techniques mainly focus on cyber-to-cyber or physical-to-physical interactions. However, in practice they can often be subverted in the face of cross-domain attacks or faults. In summary, the safety and reliability of CPS become more and more crucial every day and existing techniques to diagnose or mitigate CPS attacks and faults are not sufficient to eliminate vulnerability. The motivation of this dissertation is to enhance anomaly diagnosis and mitigation for CPS, covering physical-to-physical and cyber-to-physical attacks or faults. With the advantage of dealing with system uncertainties and providing system state estimation, observer-based anomaly diagnosis is of great interest. The first task is to design a multiple observers framework to diagnose sensor anomalies for continuous systems. Since CPS contain both continuous and discrete variables, CPS are modeled as hybrid systems. Utilizing the relationship between the continuous and discrete variables, a conflict-driven hybrid observer-based anomaly detection method is proposed, which checks for conflicts between the continuous and discrete variables to detect anomalies. Lastly, the observer design for hybrid systems is improved to enable observer-based anomaly diagnosis for a wider class of hybrid systems. 
The novel observer-based anomaly diagnosis and mitigation approaches introduced in this dissertation can diagnose not only anomalies caused by traditional faults, but also anomalies caused by sophisticated attacks. This research work can benefit the overall security of critical infrastructures, preventing disastrous consequences and reducing economic loss. The effectiveness of the proposed approaches is demonstrated mathematically and illustrated through applications to various simulated systems, including a suspension system, the Positive Train Control system and a microgrid system.
    PhD, Mechanical Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies
    https://deepblue.lib.umich.edu/bitstream/2027.42/147576/1/zhengwa_1.pd

    Tolerância a falhas em sistemas de tempo-real distribuídos e embebidos (Fault tolerance in distributed and embedded real-time systems)

    This document describes a fault-tolerance model for real-time distributed systems. The model is intended as a trustworthy, flexible and adaptable solution that meets the main needs of such systems. Fault tolerance is an extremely important feature in real-time systems design, and applying it brings many benefits. A fault-tolerance-oriented design contributes decisively to the overall system by improving key aspects such as security, reliability and availability. The work focuses on preventing, detecting and tolerating both logical (software) and physical (hardware) faults, and is based on a largely time-based architecture combined with redundancy techniques. The model is also concerned with efficiency and execution cost: it uses traditional fault-tolerance techniques such as redundancy and migration without jeopardizing service execution time, always trying to shorten replica recovery time when faults occur. The work proposes heuristics of low runtime complexity to determine where to replicate the components that make up the real-time software, and to negotiate them through an auction-based coordination mechanism. It adapts and extends some algorithms that provide a valid solution even if they are interrupted. These algorithms are described in related research and are used to form coalitions between mutually supporting nodes. The proposed model addresses faults through active replication techniques, both virtual and physical, with concurrent execution blocks, attempting to improve or maintain the quality produced while introducing almost no significant overhead in the system. The model ensures that the machines chosen as migration targets for agents iteratively improve the quality-of-service levels provided to the components, according to the availability of the respective hosts. If a new quality configuration raises the overall quality of the service, an effort is made to receive incoming components at the cost of degrading others already hosted locally. The cooperating coalition nodes maximize the number of parallel executions among the parallel components that make up the service, in order to reduce execution delays. The development of this thesis led to the proposed model and the presented results, and was supported by surveys of research and development work, the literature, and mathematical preliminaries. The work is also supported by a list of bibliographic references.