3 research outputs found

    On Statistically Estimated Optimistic Delivery in Wide-Area Total Order Protocols

    Total order broadcast protocols have been successfully applied as the basis for the construction of many fault-tolerant distributed systems. Unfortunately, the implementation of such a primitive can be expensive both in terms of communication steps and of number of messages exchanged. To alleviate this problem, optimistic total order protocols have been proposed. This paper addresses the problem of offering optimistic total order in geographically wide-area systems. We present a protocol that outperforms previous work, by minimizing the average latency of the optimistic notificatio

    Totally Ordered Broadcast and Multicast Algorithms: A Comprehensive Survey

    Total order multicast algorithms constitute an important class of problems in distributed systems, especially in the context of fault-tolerance. In short, the problem of total order multicast consists in sending messages to a set of processes, in such a way that all messages are delivered by all correct destinations in the same order. However, the huge amount of literature on the subject and the plethora of solutions proposed so far make it difficult for practitioners to select a solution adapted to their specific problem. As a result, naive solutions are often used while better solutions are ignored. This paper proposes a classification of total order multicast algorithms based on the ordering mechanism of the algorithms, and describes a set of common characteristics (e.g., assumptions, properties) with which to evaluate them. In this classification, more than fifty total order broadcast and multicast algorithms are surveyed. The presentation includes asynchronous algorithms as well as algorithms based on the more restrictive synchronous model. Fault-tolerance issues are also considered as the paper studies the properties and behavior of the different algorithms with respect to failures
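    The ordering guarantee described in this abstract can be illustrated with a minimal sketch. It assumes a single fixed sequencer that stamps each message with a sequence number, which is only one possible ordering mechanism and is not taken from the abstract. The names `Sequencer` and `Receiver` are hypothetical, and failures are not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class Sequencer:
    """Hypothetical fixed sequencer: hands out consecutive sequence numbers."""
    next_seq: int = 0

    def assign(self, msg: str) -> tuple[int, str]:
        seq = self.next_seq
        self.next_seq += 1
        return seq, msg


@dataclass
class Receiver:
    """Buffers stamped messages and delivers them strictly in sequence order."""
    expected: int = 0
    pending: dict = field(default_factory=dict)
    delivered: list = field(default_factory=list)

    def receive(self, seq: int, msg: str) -> None:
        self.pending[seq] = msg
        # Deliver every message whose predecessors have all been delivered.
        while self.expected in self.pending:
            self.delivered.append(self.pending.pop(self.expected))
            self.expected += 1


sequencer = Sequencer()
stamped = [sequencer.assign(m) for m in ("a", "b", "c")]

r1, r2 = Receiver(), Receiver()
for seq, msg in stamped:            # r1 sees the messages in sending order
    r1.receive(seq, msg)
for seq, msg in reversed(stamped):  # r2 sees them in reverse network order
    r2.receive(seq, msg)

print(r1.delivered == r2.delivered)
```

    Both receivers end up with the same delivery sequence even though the messages reached them in different orders, which is the property the abstract calls total order.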

    Tolerância a faltas bizantinas através de hibridização do sistema distribuído [Byzantine fault tolerance through hybridization of the distributed system]

    Dissertation (master's) - Universidade Federal de Santa Catarina, Centro Tecnológico, Programa de Pós-Graduação em Ciência da Computação, Florianópolis, 2013.

    The occurrence of faults and failures in computer systems can lead to disasters and to human, structural, and financial losses. Recently, faults in computer systems have appeared most often in the form of intrusions, which are the result of an attack that succeeds by exploiting one or more vulnerabilities. A recurrent issue is how much we can trust the operation of these systems, which demonstrates the need for a better application of concepts such as dependability, where the system is expected to work according to its specification even though some components have problems. State Machine Replication is a technique commonly used in the implementation of distributed services that tolerate faults and intrusions. Originally, approaches based on this technique needed 3f + 1 servers to tolerate f faults. Recently, through the use of hybrid models that have trusted components, some approaches have succeeded in reducing this number to 2f + 1. Building these trusted components requires complex modifications to the servers, in both software and hardware. The system architecture proposed in this work is based on a hybrid model, in which the assumptions about timing and about the presence and severity of faults and failures vary from component to component. The proposed model uses a data-sharing abstraction, the Distributed Shared Registers, and explores the use of virtualization technologies to simplify the creation of the tamper-proof fault-tolerant component. With this architecture it is possible to reduce the amount of computational resources needed from 3f + 1 to 2f + 1, and to achieve a latency (in number of communication steps) comparable only to speculative algorithms.
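    As a rough illustration of the replica counts quoted in this abstract, the sketch below tabulates 3f + 1 against 2f + 1 for a few values of f. The function name `replicas_needed` and its `hybrid` flag are hypothetical; only the two formulas come from the abstract.

```python
def replicas_needed(f: int, hybrid: bool = False) -> int:
    """Servers needed to tolerate f faults, using the formulas in the abstract.

    hybrid=False: 3f + 1 (original state machine replication approaches)
    hybrid=True:  2f + 1 (hybrid model with a trusted component)
    """
    if f < 0:
        raise ValueError("f must be non-negative")
    return 2 * f + 1 if hybrid else 3 * f + 1


for f in range(1, 4):
    classic = replicas_needed(f)
    hybrid = replicas_needed(f, hybrid=True)
    print(f"f={f}: 3f+1={classic}, 2f+1={hybrid}, saved={classic - hybrid}")
```

    The saving is exactly f servers: tolerating one fault takes 3 servers instead of 4, and tolerating three faults takes 7 instead of 10.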