Search CORE

12 research outputs found

Scalable service-oriented replication with flexible consistency guarantee in the cloud

Author: Abdel-Rahman H. Tawil
Amjad
Andronikou
Bernstein
Birman
Chang
Coulouris
DeCandia
Défago
Ekwall
Hsiao
Lamport
Lamport
Li
Lloret
Mansouri
Osrael
Powell
Rami Bahsoon
Saltzer
Schneider
Sheng
Tao Chen
Villegas
Vogels
Wei
Zhang
Zhu
Publication venue: 'Elsevier BV'
Publication date: 28/11/2013
Field of study

Replication techniques are widely applied in and for cloud to improve scalability and availability. In such context, the well-understood problem is how to guarantee consistency amongst different replicas and govern the trade-off between consistency and scalability requirements. Such requirements are often related to specific services and can vary considerably in the cloud. However, a major drawback of existing service-oriented replication approaches is that they only allow either restricted consistency or none at all. Consequently, service-oriented systems based on such replication techniques may violate consistency requirements or not scale well. In this paper, we present a Scalable Service Oriented Replication (SSOR) solution, a middleware that is capable of satisfying applications’ consistency requirements when replicating cloud-based services. We introduce new formalism for describing services in service-oriented replication. We propose the notion of consistency regions and relevant service oriented requirements policies, by which trading between consistency and scalability requirements can be handled within regions. We solve the associated sub-problem of atomic broadcasting by introducing a Multi-fixed Sequencers Protocol (MSP), which is a requirements aware variation of the traditional fixed sequencer approach. We also present a Region-based Election Protocol (REP) that elastically balances the workload amongst sequencers. Finally, we experimentally evaluate our approach under different loads, to show that the proposed approach achieves better scalability with more flexible consistency constraints when compared with the state-of-the-art replication technique

Crossref

Birmingham City University Open Access Repository

University of Birmingham Research Portal

Nottingham Trent Institutional Repository (IRep)

BCU Open Access

Arquitetura de elevada disponibilidade para bases de dados na cloud

Author: Abreu Hugo Miguel Ferreira
Publication venue
Publication date: 08/11/2019
Field of study

Dissertação de mestrado em Computer ScienceCom a constante expansão de sistemas informáticos nas diferentes áreas de aplicação, a quantidade de dados que exigem persistência aumenta exponencialmente. Assim, por forma a tolerar faltas e garantir a disponibilidade de dados, devem ser implementadas técnicas de replicação. Atualmente existem várias abordagens e protocolos, tendo diferentes tipos de aplicações em vista. Existem duas grandes vertentes de protocolos de replicação, protocolos genéricos, para qualquer serviço, e protocolos específicos destinados a bases de dados. No que toca a protocolos de replicação genéricos, as principais técnicas existentes, apesar de completa mente desenvolvidas e em utilização, têm algumas limitações, nomeadamente: problemas de performance relativamente a saturação da réplica primária na replicação passiva e o determinismo necessário associado à replicação ativa. Algumas destas desvantagens são mitigadas pelos protocolos específicos de base de dados (e.g., com recurso a multi-master) mas estes protocolos não permitem efetuar uma separação entre a lógica da replicação e os respetivos dados. Abordagens mais recentes tendem a basear-se em técnicas de repli cação com fundamentos em mecanismos distribuídos de logging. Tais mecanismos propor cionam alta disponibilidade de dados e tolerância a faltas, permitindo abordagens inovado ras baseadas puramente em logs. Por forma a atenuar as limitações encontradas não só no mecanismo de replicação ativa e passiva, mas também nas suas derivações, esta dissertação apresenta uma solução de replicação híbrida baseada em middleware, o SQLware. A grande vantagem desta abor dagem baseia-se na divisão entre a camada de replicação e a camada de dados, utilizando um log distribuído altamente escalável que oferece tolerância a faltas e alta disponibilidade. O protótipo desenvolvido foi validado com recurso à execução de testes de desempenho, sendo avaliado em duas infraestruturas diferentes, nomeadamente, um servidor privado de média gama e um grupo de servidores de computação de alto desempenho. Durante a avaliação do protótipo, o standard da indústria TPC-C, tipicamente utilizado para avaliar sistemas de base de dados transacionais, foi utilizado. Os resultados obtidos demonstram que o SQLware oferece uma aumento de throughput de 150 vezes, comparativamente ao mecanismo de replicação nativo da base de dados considerada, o PostgreSQL.With the constant expansion of computational systems, the amount of data that requires durability increases exponentially. All data persistence must be replicated in order to provide high-availability and fault tolerance according to the surrogate application or use-case. Currently, there are numerous approaches and replication protocols developed supporting different use-cases. There are two prominent variations of replication protocols, generic protocols, and database specific ones. The two main techniques associated with generic replication protocols are the active and passive replication. Although generic replication techniques are fully matured and widely used, there are inherent problems associated with those protocols, namely: performance issues of the primary replica of passive replication and the determinism required by the active replication. Some of those disadvantages are mitigated by specific database replication protocols (e.g., using multi-master) but, those protocols do not allow a separation between logic and data and they can not be decoupled from the database engine. Moreover, recent strategies consider highly-scalable and fault tolerant distributed logging mechanisms, allowing for newer designs based purely on logs to power replication. To mitigate the shortcomings found in both active and passive replication mechanisms, but also in partial variations of these methods, this dissertation presents a hybrid replication middleware, SQLware. The cornerstone of the approach lies in the decoupling between the logical replication layer and the data store, together with the use of a highly scalable distributed log that provides fault-tolerance and high-availability. We validated the prototype by conducting a benchmarking campaign to evaluate the overall system’s performance under two distinct infrastructures, namely a private medium class server, and a private high performance computing cluster. Across the evaluation campaign, we considered the TPCC benchmark, a widely used benchmark in the evaluation of Online transaction processing (OLTP) database systems. Results show that SQLware was able to achieve 150 times more throughput when compared with the native replication mechanism of the underlying data store considered as baseline, PostgreSQL.This work was partially funded by FCT - Fundação para a Ciência e a Tecnologia, I.P., (Portuguese Foundation for Science and Technology) within project UID/EEA/50014/201

Universidade do Minho: RepositoriUM

Recommended from our members

Replicating multithreaded services

Author: Kapritsos Emmanouil
Publication venue
Publication date: 09/02/2015
Field of study

textFor the last 40 years, the systems community has invested a lot of effort in designing techniques for building fault tolerant distributed systems and services. This effort has produced a massive list of results: the literature describes how to design replication protocols that tolerate a wide range of failures (from simple crashes to malicious "Byzantine" failures) in a wide range of settings (e.g. synchronous or asynchronous communication, with or without stable storage), optimizing various metrics (e.g. number of messages, latency, throughput). These techniques have their roots in ideas, such as the abstraction of State Machine Replication and the Paxos protocol, that were conceived when computing was very different than it is today: computers had a single core; all processing was done using a single thread of control, handling requests sequentially; and a collection of 20 nodes was considered a large distributed system. In the last decade, however, computing has gone through some major paradigm shifts, with the advent of multicore architectures and large cloud infrastructures. This dissertation explains how these profound changes impact the practical usefulness of traditional fault tolerant techniques and proposes new ways to architect these solutions to fit the new paradigms.Computer Science

Texas ScholarWorks

Practical database replication

Author: Correia Júnior Alfrânio Tavares
Publication venue
Publication date: 16/12/2010
Field of study

Tese de doutoramento em InformáticaSoftware-based replication is a cost-effective approach for fault-tolerance when combined with commodity hardware. In particular, shared-nothing database clusters built upon commodity machines and synchronized through eager software-based replication protocols have been driven by the distributed systems community in the last decade. The efforts on eager database replication, however, stem from the late 1970s with initial proposals designed by the database community. From that time, we have the distributed locking and atomic commitment protocols. Briefly speaking, before updating a data item, all copies are locked through a distributed lock, and upon commit, an atomic commitment protocol is responsible for guaranteeing that the transaction’s changes are written to a non-volatile storage at all replicas before committing it. Both these processes contributed to a poor performance. The distributed systems community improved these processes by reducing the number of interactions among replicas through the use of group communication and by relaxing the durability requirements imposed by the atomic commitment protocol. The approach requires at most two interactions among replicas and disseminates updates without necessarily applying them before committing a transaction. This relies on a high number of machines to reduce the likelihood of failures and ensure data resilience. Clearly, the availability of commodity machines and their increasing processing power makes this feasible. Proving the feasibility of this approach requires us to build several prototypes and evaluate them with different workloads and scenarios. Although simulation environments are a good starting point, mainly those that allow us to combine real (e.g., replication protocols, group communication) and simulated-code (e.g., database, network), full-fledged implementations should be developed and tested. Unfortunately, database vendors usually do not provide native support for the development of third-party replication protocols, thus forcing protocol developers to either change the database engines, when the source code is available, or construct in the middleware server wrappers that intercept client requests otherwise. The former solution is hard to maintain as new database releases are constantly being produced, whereas the latter represents a strenuous development effort as it requires us to rebuild several database features at the middleware. Unfortunately, the group-based replication protocols, optimistic or conservative, that had been proposed so far have drawbacks that present a major hurdle to their practicability. The optimistic protocols make it difficult to commit transactions in the presence of hot-spots, whereas the conservative protocols have a poor performance due to concurrency issues. In this thesis, we propose using a generic architecture and programming interface, titled GAPI, to facilitate the development of different replication strategies. The idea consists of providing key extensions to multiple DBMSs (Database Management Systems), thus enabling a replication strategy to be developed once and tested on several databases that have such extensions, i.e., those that are replication-friendly. To tackle the aforementioned problems in groupbased replication protocols, we propose using a novel protocol, titled AKARA. AKARA guarantees fairness, and thus all transactions have a chance to commit, and ensures great performance while exploiting parallelism as provided by local database engines. Finally, we outline a simple but comprehensive set of components to build group-based replication protocols and discuss key points in its design and implementation.A replicação baseada em software é uma abordagem que fornece um bom custo benefício para tolerância a falhas quando combinada com hardware commodity. Em particular, os clusters de base de dados “shared-nothing” construídos com hardware commodity e sincronizados através de protocolos “eager” têm sido impulsionados pela comunidade de sistemas distribuídos na última década. Os primeiros esforços na utilização dos protocolos “eager”, decorrem da década de 70 do século XX com as propostas da comunidade de base de dados. Dessa época, temos os protocolos de bloqueio distribuído e de terminação atómica (i.e. “two-phase commit”). De forma sucinta, antes de actualizar um item de dados, todas as cópias são bloqueadas através de um protocolo de bloqueio distribuído e, no momento de efetivar uma transacção, um protocolo de terminação atómica é responsável por garantir que as alterações da transacção são gravadas em todas as réplicas num sistema de armazenamento não-volátil. No entanto, ambos os processos contribuem para um mau desempenho do sistema. A comunidade de sistemas distribuídos melhorou esses processos, reduzindo o número de interacções entre réplicas, através do uso da comunicação em grupo e minimizando a rigidez os requisitos de durabilidade impostos pelo protocolo de terminação atómica. Essa abordagem requer no máximo duas interacções entre as réplicas e dissemina actualizações sem necessariamente aplicá-las antes de efectivar uma transacção. Para funcionar, a solução depende de um elevado número de máquinas para reduzirem a probabilidade de falhas e garantir a resiliência de dados. Claramente, a disponibilidade de hardware commodity e o seu poder de processamento crescente tornam essa abordagem possível. Comprovar a viabilidade desta abordagem obriga-nos a construir vários protótipos e a avaliálos com diferentes cargas de trabalho e cenários. Embora os ambientes de simulação sejam um bom ponto de partida, principalmente aqueles que nos permitem combinar o código real (por exemplo, protocolos de replicação, a comunicação em grupo) e o simulado (por exemplo, base de dados, rede), implementações reais devem ser desenvolvidas e testadas. Infelizmente, os fornecedores de base de dados, geralmente, não possuem suporte nativo para o desenvolvimento de protocolos de replicação de terceiros, forçando os desenvolvedores de protocolo a mudar o motor de base de dados, quando o código fonte está disponível, ou a construir no middleware abordagens que interceptam as solicitações do cliente. A primeira solução é difícil de manter já que novas “releases” das bases de dados estão constantemente a serem produzidas, enquanto a segunda representa um desenvolvimento árduo, pois obriga-nos a reconstruir vários recursos de uma base de dados no middleware. Infelizmente, os protocolos de replicação baseados em comunicação em grupo, optimistas ou conservadores, que foram propostos até agora apresentam inconvenientes que são um grande obstáculo à sua utilização. Com os protocolos optimistas é difícil efectivar transacções na presença de “hot-spots”, enquanto que os protocolos conservadores têm um fraco desempenho devido a problemas de concorrência. Nesta tese, propomos utilizar uma arquitetura genérica e uma interface de programação, intitulada GAPI, para facilitar o desenvolvimento de diferentes estratégias de replicação. A ideia consiste em fornecer extensões chaves para múltiplos SGBDs (Database Management Systems), permitindo assim que uma estratégia de replicação possa ser desenvolvida uma única vez e testada em várias bases de dados que possuam tais extensões, ou seja, aquelas que são “replicationfriendly”. Para resolver os problemas acima referidos nos protocolos de replicação baseados em comunicação em grupo, propomos utilizar um novo protocolo, intitulado AKARA. AKARA garante a equidade, portanto, todas as operações têm uma oportunidade de serem efectivadas, e garante um excelente desempenho ao tirar partido do paralelismo fornecido pelos motores de base de dados. Finalmente, propomos um conjunto simples, mas abrangente de componentes para construir protocolos de replicação baseados em comunicação em grupo e discutimos pontoschave na sua concepção e implementação

Universidade do Minho: RepositoriUM

A scalable decentralized group membership service for an asynchronous environment

Author: Neely David S.
Publication venue: Monterey, California. Naval Postgraduate School
Publication date: 01/06/1994
Field of study

This thesis presents a globally scalable, decentralized group membership service to manage client process groups operating in a distributed, asynchronous environment. This group membership service is totally scalable, handling process groups spanning a single LAN to groups spanning the entire global Internet equally well. It provides for nested and overlapping groups, as well as multiple groups residing on a single LAN. It also provides various Quality of Service selections which permit individual groups to be configured for an optimal balance between high quality with strong consistency semantics for group membership, and weaker consistency semantics with reduced complexity and latency. This thesis describes the complete design of the protocol used to implement the group membership service. It presents the design requirements and goals, and underlying assumptions about the network. The various Quality of Service selections provided by the group membership service are described in detail, as well as the interface between the process groups, the membership service, and the underlying network. The use of a hierarchical architecture to obtain the desired scalability, flexibility, and robustness is explained. A proof of correctness for the protocol is presented, and a partial implementation of the group membership service is describedhttp://archive.org/details/scalabledecentra00neelLieutenant, United States NavyApproved for public release; distribution is unlimited

Calhoun, Institutional Archive of the Naval Postgraduate School

Group communications and database replication:techniques, issues and performance

Author: Wiesmann Matthias
Publication venue: Lausanne, EPFL
Publication date: 16/03/2005
Field of study

Databases are an important part of today's IT infrastructure: both companies and state institutions rely on database systems to store most of their important data. As we are more and more dependent on database systems, securing this key facility is now a priority. Because of this, research on fault-tolerant database systems is of increasing importance. One way to ensure the fault-tolerance of a system is by replicating it. Replication is a natural way to deal with failures: if one copy is not available, we use another one. However implementing consistent replication is not easy. Database replication is hardly a new area of research: the first papers on the subject are more than twenty years old. Yet how to build an efficient, consistent replicated database is still an open research question. Recently, a new approach to solve this problem has been proposed. The idea is to rely on some communication infrastructure called group communications. This infrastructure offers some high-level primitives that can help in the design and the implementation of a replicated database. While promising, this approach to database replication is still in its infancy. This thesis focuses on group communication-based database replication and strives to give an overall understanding of this topic. This thesis has three major contributions. In the structural domain, it introduces a classification of replication techniques. In the qualitative domain, an analysis of fault-tolerance semantics is proposed. Finally, in the quantitative domain, a performance evaluation of group communication-based database replication is presented. The classification gives an overview of the different means to implement database replication. Techniques described in the literature are sorted using this classification. The classification highlights structural similarities of techniques originating from different communities (database community and distributed system community). For each category of the classification, we also analyse the requirements imposed on the database component and group communication primitives that are needed to enforce consistency. Group communication-based database replication implies building a system from two different components: a database system and a group communication system. Fault-tolerance is an end-to-end property: a system built from two components tends to be as fault-tolerant as the weakest component. The analysis of fault-tolerance semantics show what fault-tolerance guarantee is ensured by group communication based replication techniques. Additionally a new faulttolerance guarantee, group-safety, is proposed. Group-safety is better suited to group communication-based database replication. We also show that group-safe replication techniques can offer improved performance. Finally, the performance evaluation offers a quantitative view of group communication based replication techniques. The performance of group communication techniques and classical database replication techniques is compared. The way those different techniques react to different loads is explored. Some optimisation of group communication techniques are also described and their performance benefits evaluated

Infoscience - École polytechnique fédérale de Lausanne

Operating system fault tolerance support for real-time embedded applications

Author: Afonso Francisco
Publication venue
Publication date: 15/06/2009
Field of study

Tese de doutoramento em Electrónica Industrial (ramo de conhecimento em Informática Industrial)Fault tolerance is a means of achieving high dependability for critical and highavailability systems. Despite the efforts to prevent and remove faults during the development of these systems, the application of fault tolerance is usually required because the hardware may fail during system operation and software faults are very hard to eliminate completely. One of the difficulties in implementing fault tolerance techniques is the lack of support from operating systems and middleware. In most fault tolerant projects, the programmer has to develop a fault tolerance implementation for each application. This strong customization makes the fault-tolerant software costly and difficult to implement and maintain. In particular, for small-scale embedded systems, the introduction of fault tolerance techniques may also have impact on their restricted resources, such as processing power and memory size. The purpose of this research is to provide fault tolerance support for real-time applications in small-scale embedded systems. The main approach of this thesis is to develop and integrate a customizable and extendable fault tolerance framework into a real-time operating system, in order to fulfill the needs of a large range of dependable applications. Special attention is taken to allow the coexistence of fault tolerance with real-time constraints. The utilization of the proposed framework features several advantages over ad-hoc implementations, such as simplifying application-level programming and improving the system configurability and maintainability. In addition, this thesis also investigates the application of aspect-oriented techniques to the development of real-time embedded fault-tolerant software. Aspect- Oriented Programming (AOP) is employed to modularize all fault tolerant source code, following the principle of separation of concerns, and to integrate the proposed framework into the operating system. Two case studies are used to evaluate the proposed implementation in terms of performance and resource costs. The results show that the overheads related to the framework application are acceptable and the ones related to the AOP implementation are negligible.Tolerância a falhas é um meio de obter-se alta confiabilidade para sistemas críticos e de elevada disponibilidade. Apesar dos esforços para prevenir e remover falhas durante o desenvolvimento destes sistemas, a aplicação de tolerância a falhas é normalmente necessária, já que o hardware pode falhar durante a operação do sistema e falhas de software são muito difíceis de eliminar completamente. Uma das dificuldades na implementação de técnicas de tolerância a falhas é a falta de suporte por parte dos sistemas operativos e middleware. Na maioria dos projectos tolerantes a falhas, o programador deve desenvolver uma implementação de tolerância a falhas para cada aplicação. Esta elevada adaptação torna o software tolerante a falhas dispendioso e difícil de implementar e manter. Em particular, para sistemas embebidos de pequena escala, a introdução de técnicas de tolerância a falhas pode também ter impacto nos seus restritos recursos, tais como capacidade de processamento e tamanho da memória. O propósito desta tese é prover suporte à tolerância a falhas para aplicações de tempo real em sistemas embebidos de pequena escala. A principal abordagem utilizada nesta tese foi desenvolver e integrar uma framework tolerante a falhas, customizável e extensível, a um sistema operativo de tempo real, a fim de satisfazer às necessidades de uma larga gama de aplicações confiáveis. Especial atenção foi dada para permitir a coexistência de tolerância a falhas com restrições de tempo real. A utilização da framework proposta apresenta diversas vantagens sobre implementações ad-hoc, tais como simplificar a programação a nível da aplicação e melhorar a configurabilidade e a facilidade de manutenção do sistema. Além disto, esta tese também investiga a aplicação de técnicas orientadas a aspectos no desenvolvimento de software tolerante a falhas, embebido e de tempo real. A Programação Orientada a Aspectos (POA) é empregada para segregar em módulos isolados todo o código fonte tolerante a falhas, seguindo o princípio da separação de interesses, e para integrar a framework proposta com o sistema operativo. Dois casos de estudo são utilizados para avaliar a implementação proposta em termos de desempenho e utilização de recursos. Os resultados mostram que os acréscimos de recursos relativos à aplicação da framework são aceitáveis e os relativos à implementação POA são insignificantes

Universidade do Minho: RepositoriUM

Replicated execution of workflows

Author: Schäfer David Richard
Publication venue
Publication date: 01/01/2018
Field of study

Workflows are the de facto standard for managing and optimizing business processes. Workflows allow businesses to automate interactions between business locations and partners residing anywhere on the planet. This, however, requires the workflows to be executed in a distributed and dynamic environment, where device and communication failures occur quite frequently. In case that a workflow execution becomes unavailable through such failures, the business operations that rely on the workflow might be hindered or even stopped, implying the loss of money. Consequently, availability is a key concern when using workflows in dynamic environments. In this thesis, we propose replication schemes for workflow engines to ensure the availability of the workflows that are executed by these engines. Of course, a workflow that is executed by a replicated workflow engine has to yield the same result as a non-replicated execution of that workflow. To this end, we formally define the equivalence of a replicated and a non-replicated execution called Single-Execution-Equivalence. Subsequently, we present replication schemes for both imperative and declarative workflow languages. Imperative workflow languages, such as the Web Service Business Process Execution Language (WS-BPEL), specify the execution order of activities through an ordering relation and are the predominant way of specifying workflow models. We implement a proof-of-concept for demonstrating the compatibility of our replication schemes with current (imperative) workflow technology. Declarative workflow languages provide greater flexibility by allowing the reordering of the activities within a workflow at run-time. We exploit this by executing differently ordered replicas on several nodes in the network for improving availability further

Replication of non-deterministic objects

Author: Wolf Thomas
Publication venue: Lausanne, EPFL
Publication date: 16/03/2005
Field of study

This thesis discusses replication of non-deterministic objects in distributed systems to achieve fault tolerance against crash failures. The objects replicated are the virtual nodes of a distributed application. Replication is viewed as an issue that is to be dealt with only during the configuration of a distributed application and that should not affect the development of the application. Hence, replication of virtual nodes should be transparent to the application. Like all measures to achieve fault tolerance, replication introduces redundancy in the system. Not surprisingly, the main difficulty is guaranteeing the consistency of all replicas such that they behave in the same way as if the object was not replicated (replication transparency). This is further complicated if active objects (like virtual nodes) are replicated, and these objects themselves can be clients of still further objects in the distributed application. The problems of replication of active non-deterministic objects are analyzed in the context of distributed Ada 95 applications. The ISO standard for Ada 95 defines a model for distributed execution based on remote procedure calls (RPC). Virtual nodes in Ada 95 use this as their sole communication paradigm, but they may contain tasks to execute activities concurrently, thus making the execution potentially non-deterministic due to implicit timing dependencies. Such non-determinism cannot be avoided by choosing deterministic tasking policies. I present two different approaches to maintain replica consistency despite this non-determinism. In a first approach, I consider the run-time support of Ada 95 as a black box (except for the part handling remote communications). This corresponds to a non-deterministic computation model. I show that replication of non-deterministic virtual nodes requires that remote procedure calls are implemented as nested transactions. Unfortunately, effects of failures are not local to the replicas of a virtual node: when a failure occurs, nested remote calls made to other virtual nodes must be undone. Also, using transactional semantics for RPCs necessitates a compromise regarding transparency: the application must identify global state for it cannot be determined reliably in an automatic way. Further study reveals that this approach cannot be implemented in a transparent way at all because the consistency criterion of Ada 95 (linearizability) is much weaker than that of transactions (serializability). An execution of remote procedure calls as transactions may thus lead to incompatibilities with the semantics of the programming language. If remotely called subprograms on a replicated virtual node perform partial operations, i.e., entry calls on global protected objects, deadlocks that cannot be broken can occur in certain cases. Such deadlocks do not occur when the virtual node is not replicated. The transactional semantics of RPCs must therefore be exposed to the application. A second approach is based on a piecewise deterministic computation model, i.e., the execution of a virtual node is seen as a sequence of deterministic state intervals. Whenever a non-deterministic event occurs, a new state interval is started. I study replica organization under this computation model (semi-active replication). In this model, all non-deterministic decisions are made on one distinguished replica (the leader), while all other replicas (the followers) are forced to follow the same sequence of non-deterministic events. I show that it suffices to synchronize the followers with the leader upon each observable event, i.e., when the leader sends a message to some other virtual node. It is not necessary to synchronize upon each and every non-deterministic event — which would incur a prohibitively high overhead. Non-deterministic events occurring on the leader between observable events are logged and sent to the followers just before the leader executes an observable event. Consequently, it is guaranteed that the followers will reach the same state as the leader, and thus the effects of failures remain mostly local to the replicas. A prototype implementation called RAPIDS (Replicated Ada Partitions In Distributed Systems) serves as a proof of concept for this second approach, demonstrating its feasibility. RAPIDS is an Ada 95 implementation of a replication manager for semi-active replication for the GNAT development system for Ada 95. It is entirely contained within the run-time support and hence largely transparent for the application

Infoscience - École polytechnique fédérale de Lausanne

Fault-tolerant software: dependability/performance trade-offs, concurrency and system support

Author: Xu Jie
Publication venue: Newcastle University
Publication date: 01/01/1999
Field of study

PhD ThesisAs the use of computer systems becomes more and more widespread in applications that demand high levels of dependability, these applications themselves are growing in complexity in a rapid rate, especially in the areas that require concurrent and distributed computing. Such complex systems are very prone to faults and errors. No matter how rigorously fault avoidance and fault removal techniques are applied, software design faults often remain in systems when they are delivered to the customers. In fact, residual software faults are becoming the significant underlying cause of system failures and the lack of dependability. There is tremendous need for systematic techniques for building dependable software, including the fault tolerance techniques that ensure software-based systems to operate dependably even when potential faults are present. However, although there has been a large amount of research in the area of fault-tolerant software, existing techniques are not yet sufficiently mature as a practical engineering discipline for realistic applications. In particular, they are often inadequate when applied to highly concurrent and distributed software. This thesis develops new techniques for building fault-tolerant software, addresses the problem of achieving high levels of dependability in concurrent and distributed object systems, and studies system-level support for implementing dependable software. Two schemes are developed - the t/(n-l)-VP approach is aimed at increasing software reliability and controlling additional complexity, while the SCOP approach presents an adaptive way of dynamically adjusting software reliability and efficiency aspects. As a more general framework for constructing dependable concurrent and distributed software, the Coordinated Atomic (CA) Action scheme is examined thoroughly. Key properties of CA actions are formalized, conceptual model and mechanisms for handling application level exceptions are devised, and object-based diversity techniques are introduced to cope with potential software faults. These three schemes are evaluated analytically and validated by controlled experiments. System-level support is also addressed with a multi-level system architecture. An architectural pattern for implementing fault-tolerant objects is documented in detail to capture existing solutions and our previous experience. An industrial safety-critical application, the Fault-Tolerant Production Cell, is used as a case study to examine most of the concepts and techniques developed in this research.ESPRIT

Newcastle University eTheses