12 research outputs found
Scalable service-oriented replication with flexible consistency guarantee in the cloud
Replication techniques are widely applied in and for the cloud to improve scalability and availability. In this context, the well-understood problem is how to guarantee consistency among replicas and govern the trade-off between consistency and scalability requirements. Such requirements are often related to specific services and can vary considerably in the cloud. However, a major drawback of existing service-oriented replication approaches is that they allow either restricted consistency or none at all. Consequently, service-oriented systems based on such replication techniques may violate consistency requirements or scale poorly. In this paper, we present Scalable Service Oriented Replication (SSOR), a middleware solution capable of satisfying applications’ consistency requirements when replicating cloud-based services. We introduce a new formalism for describing services in service-oriented replication. We propose the notion of consistency regions and relevant service-oriented requirements policies, by which the trade-off between consistency and scalability requirements can be handled within regions. We solve the associated sub-problem of atomic broadcasting by introducing a Multi-fixed Sequencers Protocol (MSP), a requirements-aware variation of the traditional fixed sequencer approach. We also present a Region-based Election Protocol (REP) that elastically balances the workload among sequencers. Finally, we experimentally evaluate our approach under different loads to show that it achieves better scalability with more flexible consistency constraints than the state-of-the-art replication technique.
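For readers unfamiliar with the traditional fixed sequencer approach that the abstract takes as its starting point, the following is a minimal sketch of that baseline only. It is not the MSP or REP protocol from the paper, and the names `Sequencer`, `Replica`, and `demo` are hypothetical, chosen for illustration. One sequencer stamps each message with a global sequence number, and every replica delivers messages strictly in that order, buffering any that arrive early.

```python
from dataclasses import dataclass, field


class Sequencer:
    """Single fixed sequencer: stamps each message with the next global number."""

    def __init__(self):
        self._next = 0

    def order(self, payload):
        stamped = (self._next, payload)
        self._next += 1
        return stamped


@dataclass
class Replica:
    """Delivers stamped messages strictly in sequence-number order."""

    expected: int = 0
    pending: dict = field(default_factory=dict)
    delivered: list = field(default_factory=list)

    def receive(self, stamped):
        seq, payload = stamped
        self.pending[seq] = payload
        # Deliver every buffered message that is now contiguous with the
        # last delivered one; later messages stay buffered.
        while self.expected in self.pending:
            self.delivered.append(self.pending.pop(self.expected))
            self.expected += 1


def demo():
    sequencer = Sequencer()
    stamped = [sequencer.order(op) for op in ("put a", "put b", "del a")]
    r1, r2 = Replica(), Replica()
    for msg in stamped:            # messages arrive in order
        r1.receive(msg)
    for msg in reversed(stamped):  # messages arrive in reverse order
        r2.receive(msg)
    return r1.delivered, r2.delivered
```

Both replicas end up with the same delivery order regardless of arrival order, which is the total-order property that atomic broadcast must provide. The single sequencer is also the scalability bottleneck that motivates multi-sequencer designs.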
Arquitetura de elevada disponibilidade para bases de dados na cloud
Master's dissertation in Computer Science
With the constant expansion of computational systems across application areas, the amount of data that requires durability increases exponentially. Persisted data must therefore be replicated to provide high availability and fault tolerance, according to the application or use case.
Currently, numerous approaches and replication protocols exist, each supporting different use cases. There are two prominent families of replication protocols: generic protocols, for any service, and database-specific ones. The two main techniques associated with generic replication protocols are active and passive replication. Although generic replication techniques are mature and widely used, they have inherent problems, namely saturation of the primary replica in passive replication and the determinism required by active replication. Some of those disadvantages are mitigated by database-specific replication protocols (e.g., using multi-master), but those protocols do not allow a separation between replication logic and data, and they cannot be decoupled from the database engine. Moreover, recent strategies build on highly scalable and fault-tolerant distributed logging mechanisms, allowing for newer designs based purely on logs to power replication.
To mitigate the shortcomings found in both active and passive replication mechanisms, as well as in their variations, this dissertation presents a hybrid replication middleware, SQLware. The cornerstone of the approach lies in the decoupling between the logical replication layer and the data store, together with the use of a highly scalable distributed log that provides fault tolerance and high availability. We validated the prototype by conducting a benchmarking campaign to evaluate the overall system’s performance on two distinct infrastructures, namely a private mid-range server and a private high-performance computing cluster. Across the evaluation campaign, we considered TPC-C, a widely used industry-standard benchmark for evaluating online transaction processing (OLTP) database systems. Results show that SQLware achieved 150 times more throughput than the native replication mechanism of the underlying data store considered as baseline, PostgreSQL.
This work was partially funded by FCT - Fundação para a Ciência e a Tecnologia, I.P., (Portuguese Foundation for Science and Technology) within project UID/EEA/50014/201
Replicating multithreaded services
For the last 40 years, the systems community has invested a lot of effort in designing techniques for building fault tolerant distributed systems and services. This effort has produced a massive list of results: the literature describes how to design replication protocols that tolerate a wide range of failures (from simple crashes to malicious "Byzantine" failures) in a wide range of settings (e.g. synchronous or asynchronous communication, with or without stable storage), optimizing various metrics (e.g. number of messages, latency, throughput). These techniques have their roots in ideas, such as the abstraction of State Machine Replication and the Paxos protocol, that were conceived when computing was very different than it is today: computers had a single core; all processing was done using a single thread of control, handling requests sequentially; and a collection of 20 nodes was considered a large distributed system. In the last decade, however, computing has gone through some major paradigm shifts, with the advent of multicore architectures and large cloud infrastructures. This dissertation explains how these profound changes impact the practical usefulness of traditional fault tolerant techniques and proposes new ways to architect these solutions to fit the new paradigms.
Computer Science
Practical database replication
Doctoral thesis in Informatics
Software-based replication is a cost-effective approach to fault tolerance when combined with
commodity hardware. In particular, shared-nothing database clusters built upon commodity machines
and synchronized through eager software-based replication protocols have been driven by
the distributed systems community in the last decade.
The efforts on eager database replication, however, date back to the late 1970s, with initial
proposals designed by the database community. From that time, we have the distributed locking
and atomic commitment (i.e., two-phase commit) protocols. Briefly, before a data item is updated, all copies
are locked through a distributed lock and, upon commit, an atomic commitment protocol is
responsible for guaranteeing that the transaction’s changes are written to non-volatile storage
at all replicas before committing it. Both of these processes contributed to poor performance.
The distributed systems community improved these processes by reducing the number of interactions
among replicas through the use of group communication and by relaxing the durability
requirements imposed by the atomic commitment protocol. The approach requires at most two
interactions among replicas and disseminates updates without necessarily applying them before
committing a transaction. This relies on a high number of machines to reduce the likelihood of
failures and ensure data resilience. Clearly, the availability of commodity machines and their
increasing processing power makes this feasible.
Proving the feasibility of this approach requires us to build several prototypes and evaluate
them with different workloads and scenarios. Although simulation environments are a good starting
point, mainly those that allow us to combine real (e.g., replication protocols, group communication)
and simulated code (e.g., database, network), full-fledged implementations should be
developed and tested. Unfortunately, database vendors usually do not provide native support for
the development of third-party replication protocols, thus forcing protocol developers either to
change the database engine, when the source code is available, or otherwise to build middleware
wrappers that intercept client requests. The former solution is hard to maintain
as new database releases are constantly being produced, whereas the latter represents a strenuous
development effort as it requires us to rebuild several database features at the middleware.
Unfortunately, the group-based replication protocols, optimistic or conservative, that had
been proposed so far have drawbacks that present a major hurdle to their practicability. The
optimistic protocols make it difficult to commit transactions in the presence of hot-spots, whereas
the conservative protocols perform poorly due to concurrency issues.
In this thesis, we propose using a generic architecture and programming interface, titled
GAPI, to facilitate the development of different replication strategies. The idea consists of providing key extensions to multiple DBMSs (Database Management Systems), thus enabling a
replication strategy to be developed once and tested on several databases that have such extensions,
i.e., those that are replication-friendly. To tackle the aforementioned problems in group-based
replication protocols, we propose using a novel protocol, titled AKARA. AKARA guarantees
fairness, and thus all transactions have a chance to commit, and ensures great performance
while exploiting parallelism as provided by local database engines. Finally, we outline a simple
but comprehensive set of components to build group-based replication protocols and discuss key
points in its design and implementation.
A scalable decentralized group membership service for an asynchronous environment
This thesis presents a globally scalable, decentralized group membership service to manage client process groups operating in a distributed, asynchronous environment. This group membership service is totally scalable, handling process groups spanning a single LAN to groups spanning the entire global Internet equally well. It provides for nested and overlapping groups, as well as multiple groups residing on a single LAN. It also provides various Quality of Service selections which permit individual groups to be configured for an optimal balance between high quality with strong consistency semantics for group membership, and weaker consistency semantics with reduced complexity and latency. This thesis describes the complete design of the protocol used to implement the group membership service. It presents the design requirements and goals, and underlying assumptions about the network. The various Quality of Service selections provided by the group membership service are described in detail, as well as the interface between the process groups, the membership service, and the underlying network. The use of a hierarchical architecture to obtain the desired scalability, flexibility, and robustness is explained. A proof of correctness for the protocol is presented, and a partial implementation of the group membership service is described.
http://archive.org/details/scalabledecentra00neel
Lieutenant, United States Navy
Approved for public release; distribution is unlimited
Group communications and database replication: techniques, issues and performance
Databases are an important part of today's IT infrastructure: both companies and state institutions rely on database systems to store most of their important data. As we are more and more dependent on database systems, securing this key facility is now a priority. Because of this, research on fault-tolerant database systems is of increasing importance. One way to ensure the fault-tolerance of a system is by replicating it. Replication is a natural way to deal with failures: if one copy is not available, we use another one. However implementing consistent replication is not easy. Database replication is hardly a new area of research: the first papers on the subject are more than twenty years old. Yet how to build an efficient, consistent replicated database is still an open research question. Recently, a new approach to solve this problem has been proposed. The idea is to rely on some communication infrastructure called group communications. This infrastructure offers some high-level primitives that can help in the design and the implementation of a replicated database. While promising, this approach to database replication is still in its infancy. This thesis focuses on group communication-based database replication and strives to give an overall understanding of this topic. This thesis has three major contributions. In the structural domain, it introduces a classification of replication techniques. In the qualitative domain, an analysis of fault-tolerance semantics is proposed. Finally, in the quantitative domain, a performance evaluation of group communication-based database replication is presented. The classification gives an overview of the different means to implement database replication. Techniques described in the literature are sorted using this classification. The classification highlights structural similarities of techniques originating from different communities (database community and distributed system community). 
For each category of the classification, we also analyse the requirements imposed on the database component and the group communication primitives that are needed to enforce consistency. Group communication-based database replication implies building a system from two different components: a database system and a group communication system. Fault-tolerance is an end-to-end property: a system built from two components tends to be as fault-tolerant as the weakest component. The analysis of fault-tolerance semantics shows what fault-tolerance guarantee is ensured by group communication-based replication techniques. Additionally, a new fault-tolerance guarantee, group-safety, is proposed. Group-safety is better suited to group communication-based database replication. We also show that group-safe replication techniques can offer improved performance. Finally, the performance evaluation offers a quantitative view of group communication-based replication techniques. The performance of group communication techniques and classical database replication techniques is compared. The way those different techniques react to different loads is explored. Some optimisations of group communication techniques are also described and their performance benefits evaluated.
Operating system fault tolerance support for real-time embedded applications
Doctoral thesis in Industrial Electronics (specialization in Industrial Informatics)
Fault tolerance is a means of achieving high dependability for critical and high-availability
systems. Despite the efforts to prevent and remove faults during the
development of these systems, the application of fault tolerance is usually required
because the hardware may fail during system operation and software faults are very
hard to eliminate completely.
One of the difficulties in implementing fault tolerance techniques is the lack of
support from operating systems and middleware. In most fault tolerant projects, the
programmer has to develop a fault tolerance implementation for each application.
This strong customization makes the fault-tolerant software costly and difficult to
implement and maintain. In particular, for small-scale embedded systems, the
introduction of fault tolerance techniques may also have impact on their restricted
resources, such as processing power and memory size.
The purpose of this research is to provide fault tolerance support for real-time
applications in small-scale embedded systems. The main approach of this thesis is to
develop and integrate a customizable and extendable fault tolerance framework into a
real-time operating system, in order to fulfill the needs of a large range of dependable
applications. Special attention is taken to allow the coexistence of fault tolerance with
real-time constraints. The utilization of the proposed framework features several
advantages over ad-hoc implementations, such as simplifying application-level
programming and improving the system configurability and maintainability.
In addition, this thesis also investigates the application of aspect-oriented
techniques to the development of real-time embedded fault-tolerant software. Aspect-
Oriented Programming (AOP) is employed to modularize all fault tolerant source code, following the principle of separation of concerns, and to integrate the proposed
framework into the operating system.
Two case studies are used to evaluate the proposed implementation in terms of
performance and resource costs. The results show that the overheads related to the
framework application are acceptable and the ones related to the AOP implementation
are negligible.
Replicated execution of workflows
Workflows are the de facto standard for managing and optimizing business processes. Workflows allow businesses to automate interactions between business locations and partners residing anywhere on the planet. This, however, requires the workflows to be executed in a distributed and dynamic environment, where device and communication failures occur quite frequently. If a workflow execution becomes unavailable through such failures, the business operations that rely on the workflow might be hindered or even stopped, implying the loss of money. Consequently, availability is a key concern when using workflows in dynamic environments.
In this thesis, we propose replication schemes for workflow engines to ensure the availability of the workflows that are executed by these engines. Of course, a workflow that is executed by a replicated workflow engine has to yield the same result as a non-replicated execution of that workflow. To this end, we formally define the equivalence of a replicated and a non-replicated execution called Single-Execution-Equivalence. Subsequently, we present replication schemes for both imperative and declarative workflow languages. Imperative workflow languages, such as the Web Service Business Process Execution Language (WS-BPEL), specify the execution order of activities through an ordering relation and are the predominant way of specifying workflow models. We implement a proof-of-concept for demonstrating the compatibility of our replication schemes with current (imperative) workflow technology. Declarative workflow languages provide greater flexibility by allowing the reordering of the activities within a workflow at run-time. We exploit this by executing differently ordered replicas on several nodes in the network for improving availability further
Replication of non-deterministic objects
This thesis discusses replication of non-deterministic objects in distributed systems to achieve fault tolerance against crash failures. The objects replicated are the virtual nodes of a distributed application. Replication is viewed as an issue that is to be dealt with only during the configuration of a distributed application and that should not affect the development of the application. Hence, replication of virtual nodes should be transparent to the application. Like all measures to achieve fault tolerance, replication introduces redundancy in the system. Not surprisingly, the main difficulty is guaranteeing the consistency of all replicas such that they behave in the same way as if the object was not replicated (replication transparency). This is further complicated if active objects (like virtual nodes) are replicated, and these objects themselves can be clients of still further objects in the distributed application. The problems of replication of active non-deterministic objects are analyzed in the context of distributed Ada 95 applications. The ISO standard for Ada 95 defines a model for distributed execution based on remote procedure calls (RPC). Virtual nodes in Ada 95 use this as their sole communication paradigm, but they may contain tasks to execute activities concurrently, thus making the execution potentially non-deterministic due to implicit timing dependencies. Such non-determinism cannot be avoided by choosing deterministic tasking policies. I present two different approaches to maintain replica consistency despite this non-determinism. In a first approach, I consider the run-time support of Ada 95 as a black box (except for the part handling remote communications). This corresponds to a non-deterministic computation model. I show that replication of non-deterministic virtual nodes requires that remote procedure calls are implemented as nested transactions. 
Unfortunately, effects of failures are not local to the replicas of a virtual node: when a failure occurs, nested remote calls made to other virtual nodes must be undone. Also, using transactional semantics for RPCs necessitates a compromise regarding transparency: the application must identify global state for it cannot be determined reliably in an automatic way. Further study reveals that this approach cannot be implemented in a transparent way at all because the consistency criterion of Ada 95 (linearizability) is much weaker than that of transactions (serializability). An execution of remote procedure calls as transactions may thus lead to incompatibilities with the semantics of the programming language. If remotely called subprograms on a replicated virtual node perform partial operations, i.e., entry calls on global protected objects, deadlocks that cannot be broken can occur in certain cases. Such deadlocks do not occur when the virtual node is not replicated. The transactional semantics of RPCs must therefore be exposed to the application. A second approach is based on a piecewise deterministic computation model, i.e., the execution of a virtual node is seen as a sequence of deterministic state intervals. Whenever a non-deterministic event occurs, a new state interval is started. I study replica organization under this computation model (semi-active replication). In this model, all non-deterministic decisions are made on one distinguished replica (the leader), while all other replicas (the followers) are forced to follow the same sequence of non-deterministic events. I show that it suffices to synchronize the followers with the leader upon each observable event, i.e., when the leader sends a message to some other virtual node. It is not necessary to synchronize upon each and every non-deterministic event — which would incur a prohibitively high overhead. 
Non-deterministic events occurring on the leader between observable events are logged and sent to the followers just before the leader executes an observable event. Consequently, it is guaranteed that the followers will reach the same state as the leader, and thus the effects of failures remain mostly local to the replicas. A prototype implementation called RAPIDS (Replicated Ada Partitions In Distributed Systems) serves as a proof of concept for this second approach, demonstrating its feasibility. RAPIDS is an Ada 95 implementation of a replication manager for semi-active replication for the GNAT development system for Ada 95. It is entirely contained within the run-time support and hence largely transparent to the application.
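The leader/follower logging scheme summarized in this abstract can be illustrated with a small sketch. This is not the RAPIDS implementation (which is written in Ada 95); it is a simplified model under stated assumptions: a non-deterministic event is modeled as a random integer, the replica state is a running sum, and the names `Leader`, `Follower`, `nondeterministic_step`, `observable_event`, and `demo` are all hypothetical.

```python
import random


class Follower:
    """Replays the leader's logged decisions instead of deciding itself."""

    def __init__(self):
        self.state = 0

    def replay(self, events):
        for choice in events:
            self.state += choice


class Leader:
    """Makes all non-deterministic decisions and logs them between flushes."""

    def __init__(self, followers, seed=None):
        self.followers = followers
        self.rng = random.Random(seed)
        self.state = 0
        self.log = []  # decisions taken since the last observable event

    def nondeterministic_step(self):
        # Stand-in for a non-deterministic event such as a scheduling decision.
        choice = self.rng.randint(1, 6)
        self.log.append(choice)
        self.state += choice

    def observable_event(self):
        # Synchronize followers just before the outside world can observe
        # the leader's state, not on every non-deterministic event.
        for follower in self.followers:
            follower.replay(self.log)
        self.log = []
        return self.state  # the "message" sent to another node


def demo():
    followers = [Follower(), Follower()]
    leader = Leader(followers, seed=7)
    for _ in range(3):
        leader.nondeterministic_step()
    leader.observable_event()
    return leader, followers
```

Between observable events the followers may lag behind the leader, but at each observable event they have replayed the same decision sequence, so a follower could take over without the outside world seeing a divergent state.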
Fault-tolerant software: dependability/performance trade-offs, concurrency and system support
PhD Thesis
As the use of computer systems becomes more and more widespread in applications
that demand high levels of dependability, these applications themselves are growing in
complexity at a rapid rate, especially in the areas that require concurrent and distributed
computing. Such complex systems are very prone to faults and errors. No matter how
rigorously fault avoidance and fault removal techniques are applied, software design
faults often remain in systems when they are delivered to the customers. In fact,
residual software faults are becoming the significant underlying cause of system
failures and the lack of dependability. There is a tremendous need for systematic
techniques for building dependable software, including the fault tolerance techniques
that enable software-based systems to operate dependably even when potential faults
are present. However, although there has been a large amount of research in the area of
fault-tolerant software, existing techniques are not yet sufficiently mature as a practical
engineering discipline for realistic applications. In particular, they are often inadequate
when applied to highly concurrent and distributed software.
This thesis develops new techniques for building fault-tolerant software, addresses the
problem of achieving high levels of dependability in concurrent and distributed object
systems, and studies system-level support for implementing dependable software. Two
schemes are developed: the t/(n-1)-VP approach is aimed at increasing software
reliability and controlling additional complexity, while the SCOP approach presents an
adaptive way of dynamically adjusting software reliability and efficiency aspects. As a
more general framework for constructing dependable concurrent and distributed
software, the Coordinated Atomic (CA) Action scheme is examined thoroughly. Key
properties of CA actions are formalized, conceptual model and mechanisms for
handling application level exceptions are devised, and object-based diversity
techniques are introduced to cope with potential software faults. These three schemes
are evaluated analytically and validated by controlled experiments. System-level
support is also addressed with a multi-level system architecture. An architectural
pattern for implementing fault-tolerant objects is documented in detail to capture
existing solutions and our previous experience. An industrial safety-critical application,
the Fault-Tolerant Production Cell, is used as a case study to examine most of the
concepts and techniques developed in this research.
ESPRIT