369 research outputs found

    Distributed algorithms for hard real-time systems

    Get PDF
    viii+124hlm.;24c

    Fault-Tolerant Distributed Services in Message-Passing Systems

    Get PDF
    Distributed systems ranging from small local area networks to large wide area networks like the Internet composed of static and/or mobile users have become increasingly popular. A desirable property for any distributed service is fault-tolerance, which means the service remains uninterrupted even if some components in the network fail. This dissertation considers weak distributed models to find either algorithms to solve certain problems or impossibility proofs to show that a problem is unsolvable. These are the main contributions of this dissertation: • Failure detectors are used as a service to solve consensus (agreement among nodes) which is otherwise impossible in failure-prone asynchronous systems. We find an algorithm for crash-failure detection that uses bounded size messages in an arbitrary, partitionable network composed of badly- behaved channels that can lose and reorder messages. • Registers are a fundamental building block for shared memory emulations on top of message passing systems. The problem has been extensively studied in static systems. However, register emulation in dynamic systems with faulty nodes is still quite hard and there are impossibility proofs that point out scenarios where change in the system composition due to nodes entering and leaving (also called churn) makes the problem unsolvable. We propose the first emulation of a crash-fault tolerant register in a system with continuous churn where consensus is unsolvable, the size of the system can grow without bound and at most a constant fraction of the number of nodes in the system can fail by crashing. We prove a lower bound that states that fault-tolerance for dynamic systems with churn is inherently lower than in static systems. • We then extend the results in the crash-fault tolerant case to a dynamic system with continuous churn and nodes that can be Byzantine faulty. It is the first emulation of an atomic register in a system that can withstand nodes continually entering and leaving, imposes no upper bound on the system size and can tolerate Byzantine nodes. However, the number of Byzantine faulty nodes that can be tolerated is upper bounded by a constant number. Although the algorithm requires that there be a constant known upper bound on the number of Byzantine nodes, this restriction is unavoidable, as we show that it is impossible to emulate an atomic register if the system size and maximum number of servers that can be Byzantine in the system is unknown

    Revisiting Liveness Properties in the Context of Secure Systems

    Get PDF
    Distinguishing trace-based system properties into safety properties on the one hand and liveness properties on the other has proven very useful for specifying and validating concurrent and fault-tolerant systems. We study the adequacy of these abstractions, especially the liveness property abstraction, in the context of secure systems for two different scenarios: (1) Denial-of-service attacks and (2) brute-force attacks on secret keys. We argue that in both cases the concept of a liveness property needs to be adapted. We show how this can be done and relate the resulting concepts to related work in the areas of concurrency theory and fault-tolerance

    Achieving consensus in fault-tolerant distributed computer systems : protocols, lower bounds, and simulations

    Get PDF
    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1987.Vita.Bibliography: p. 145-149.by Brian A. Coan.Ph.D

    The notion of Timed Registers and its application to Indulgent Synchronization

    Get PDF
    A new type of shared object, called timed register, is proposed and used to design indulgent timing-based algorithms.A timed register generalizes the notion of an atomic register as follows: if a process invokes two consecutive operations on the same timed register which are a read followed by a write, then the write operation is executed only if it is invoked at most d time units after the read operation, where d is defined as part of the read operation. In this context, a timing-based algorithm is an algorithm whose correctness relies on the existence of a bound Δ\Delta such that any pair of consecutive constrained read and write operations issued by the same process on the same timed register are separated by at most Δ\Delta time units. An indulgent algorithm is an algorithm that always guarantees the safety properties, and ensures the liveness property as soon as the timing assumptions are satisfied. The usefulness of this new type of shared object is demonstrated by presenting simple and elegant indulgent timing-based algorithms that solve the mutual exclusion, \ell-exclusion, adaptive renaming, test&set, and consensus problems. Interestingly, timed registers are universal objects in systems with process crashes and transient timing failures (i.e., they allow building any concurrent object with a sequential specification). The paper also suggests connections with schedulers and contention managers

    Proactive resilience

    Get PDF
    Tese de doutoramento em Informática (Ciências da Computação), apresentada à Universidade de Lisboa através da Faculdade de Ciências, 2007Disponível no document

    Intrusion-Tolerant Middleware: the MAFTIA approach

    Get PDF
    The pervasive interconnection of systems all over the world has given computer services a significant socio-economic value, which can be affected both by accidental faults and by malicious activity. It would be appealing to address both problems in a seamless manner, through a common approach to security and dependability. This is the proposal of intrusion tolerance, where it is assumed that systems remain to some extent faulty and/or vulnerable and subject to attacks that can be successful, the idea being to ensure that the overall system nevertheless remains secure and operational. In this paper, we report some of the advances made in the European project MAFTIA, namely in what concerns a basis of concepts unifying security and dependability, and a modular and versatile architecture, featuring several intrusion-tolerant middleware building blocks. We describe new architectural constructs and algorithmic strategies, such as: the use of trusted components at several levels of abstraction; new randomization techniques; new replica control and access control algorithms. The paper concludes by exemplifying the construction of intrusion-tolerant applications on the MAFTIA middleware, through a transaction support servic

    Fault tolerant software technology for distributed computing system

    Get PDF
    Issued as Monthly reports [nos. 1-23], Interim technical report, Technical guide books [nos. 1-2], and Final report, Project no. G-36-64

    Support for dependable and adaptive distributed systems and applications

    Get PDF
    Tese de doutoramento, Informática (Engenharia Informática), Universidade de Lisboa, Faculdade de Ciências, 2011Distributed applications executing in uncertain environments, like the Internet, need to make timing/synchrony assumptions (for instance, about the maximum message transmission delay), in order to make progress. In the case of adaptive systems these temporal bounds should be computed at runtime, using probabilistic or specifically designed ad hoc approaches, typically with the objective of improving the application performance. From a dependability perspective, however, the concern is to secure some properties on which the application can rely. This thesis addresses the problem of supporting adaptive systems and applications in stochastic environments, from a dependability perspective: maintaining the correctness of system properties after adaptation. The idea behind dependable adaptation consists in ensuring that the assumed bounds for fundamental variables (e.g., network delays) are secured with a known and constant probability. Assuming that during its lifetime a system alternates periods where its temporal behavior is well characterized (stable phases), with transition periods where a variation of the network conditions occurs (transient phases), the proposed approach is based on the following: if the environment is generically characterized in analytical terms and it is possible to detect the alternation of these stable and transient phases, then it is possible to effectively and dependably adapt applications. Based on this idea, the thesis introduces Adaptare, a framework for supporting dependable adaptation in stochastic environments. An extensive evaluation of Adaptare is provided, assessing the correctness and effectiveness of the implemented mechanisms. The results indicate that the proposed strategies and methodologies are indeed effective to support dependable adaptation of distributed systems and applications. Finally, the applicability of Adaptare is evaluated in the context of two fundamental problems in distributed systems: consensus and failure detection. The thesis proposes solutions for these problems based on modular architectures in which Adaptare is used as a middleware for dependable adaptation of assumed timeouts.Aplicações distribuídas que executam em ambientes incertos, como a Internet, baseiam-se em pressupostos sobre tempo/sincronia (por exemplo, assumem um tempo máximo para a transmissão de mensagens) a fim de assegurar progresso. No caso de sistemas adaptativos, esses limites temporais devem ser calculados em tempo de execução, usando abordagens probabilísticas ou desenhadas de forma específica e ad hoc, tipicamente visando melhorar o desempenho da aplicação. Sob o ponto de vista da confiabilidade, no entanto, o objetivo é garantir algumas propriedades nas quais a aplicação pode confiar. Esta tese aborda o problema de suportar sistemas adaptativos e aplicações que operam em ambientes estocásticos, numa perspectiva de confiabilidade: mantendo a correção das propriedades do sistema após a adaptação. A ideia da adaptação confiável consiste em garantir que os limites assumidos para variáveis fundamentais (por exemplo, latências de transmissão) são assegurados com uma probabilidade conhecida e constante. Supondo que durante a execução o sistema alterna períodos nos quais o seu comportamento temporal é bem caracterizado (fases estáveis), com períodos de transição durante os quais ocorrem variações das condições da rede (fases transientes), a abordagem proposta baseia-se no seguinte: se o ambiente é genericamente caracterizado em termos analíticos e é possível detetar a alternância entre fases estáveis e transientes, então é possível adaptar as aplicações de forma efetiva e confiável. Com base nesta ideia, a tese apresenta uma plataforma para suportar a adaptação confiável em ambientes estocásticos, denominada Adaptare. A tese contém uma extensa avaliação do Adaptare, que foi realizada para verificar a correção e eficácia dos mecanismos desenvolvidos. Os resultados indicam que as estratégias e metodologias propostas são de facto efetivas para suportar a adaptação confiável de sistemas e aplicações distribuídas. Finalmente, a aplicabilidade do Adaptare é avaliada no contexto de dois problemas fundamentais em sistemas distribuídos: consenso e deteção de falhas. A tese propõe soluções para estes problemas baseadas em arquiteturas modulares nas quais o Adaptare é usado como um middleware para a adaptação confiável de timeouts.Fundação para a Ciência e a Tecnologia (FCT
    corecore