Authenticating Operation-based History in Collaborative Systems
In recent years, multi-synchronous collaborative editing systems have become widely used. Multi-synchronous collaboration maintains multiple, simultaneous streams of activity that continually diverge and synchronize. These streams of activity are represented by logs of operations, i.e. user modifications. A malicious user might tamper with his log of operations. At the moment of synchronization with other streams, the tampered log might generate wrong results. In this paper, we propose a solution relying on hash-chain based authenticators for authenticating logs that ensures the authenticity and integrity of logs, as well as user accountability. We present algorithms to construct authenticators and verify logs. We prove their correctness and provide theoretical and practical evaluations.
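As a rough illustration of the hash-chain idea mentioned in this abstract, the sketch below chains each operation to the authenticator of the previous one, so that editing, dropping, or reordering an entry invalidates every later authenticator. This is not the paper's construction: the function names, the `seed` value, and the use of a shared-key HMAC are assumptions made for brevity. Real user accountability would likely need per-user digital signatures rather than a shared key.

```python
import hashlib
import hmac


def build_authenticators(operations, key, seed=b"genesis"):
    """Return one authenticator per operation.

    Each authenticator covers the operation and the previous authenticator,
    which is what makes the sequence a chain. `seed` is a hypothetical
    starting value for the first link.
    """
    authenticators = []
    previous = seed
    for op in operations:
        digest = hmac.new(key, previous + op.encode("utf-8"), hashlib.sha256).digest()
        authenticators.append(digest)
        previous = digest
    return authenticators


def verify_log(operations, authenticators, key, seed=b"genesis"):
    """Recompute the chain and compare it with the stored authenticators."""
    if len(operations) != len(authenticators):
        return False
    expected = build_authenticators(operations, key, seed)
    # compare_digest avoids leaking where the first mismatch occurs.
    return all(hmac.compare_digest(e, a) for e, a in zip(expected, authenticators))
```

A verifier holding the key recomputes the chain from the received log; any modified operation changes its own authenticator and all that follow it.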
Operating System Support for Redundant Multithreading
Failing hardware is a fact and trends in microprocessor design indicate that the fraction of hardware suffering from permanent and transient faults will continue to increase in future chip generations. Researchers proposed various solutions to this issue with different downsides: Specialized hardware components make hardware more expensive in production and consume additional energy at runtime. Fault-tolerant algorithms and libraries enforce specific programming models on the developer. Compiler-based fault tolerance requires the source code for all applications to be available for recompilation. In this thesis I present ASTEROID, an operating system architecture that integrates applications with different reliability needs.
ASTEROID is built on top of the L4/Fiasco.OC microkernel and extends the system with Romain, an operating system service that transparently replicates user applications. Romain supports single- and multi-threaded applications without requiring access to the application's source code. Romain replicates applications and their resources completely and thereby does not rely on hardware extensions, such as ECC-protected memory. In my thesis I describe how to efficiently implement replication as a form of redundant multithreading in software. I develop mechanisms to manage replica resources and to make multi-threaded programs behave deterministically for replication.
I furthermore present an approach to handle applications that use shared-memory channels with other programs. My evaluation shows that Romain provides 100% error detection and more than 99.6% error correction for single-bit flips in memory and general-purpose registers. At the same time, Romain's execution time overhead is below 14% for single-threaded applications running in triple-modular redundant mode. The last part of my thesis acknowledges that software-implemented fault tolerance methods often rely on the correct functioning of a certain set of hardware and software components, the Reliable Computing Base (RCB).
I introduce the concept of the RCB and discuss what constitutes the RCB of the ASTEROID system and other fault tolerance mechanisms. Thereafter I show three case studies that evaluate approaches to protecting RCB components and thereby aim to achieve a software stack that is fully protected against hardware errors.
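The triple-modular redundant mode mentioned in this abstract rests on majority voting: three replicas allow a single faulty result to be both detected and outvoted, while two replicas allow detection only. The sketch below shows that voting step at the level of a plain function call. It is a toy model, not a description of how Romain replicates processes; `vote`, `run_replicated`, and the `fault` injection hook are hypothetical names introduced for illustration.

```python
from collections import Counter


def vote(replica_outputs):
    """Return (majority_value, faulty_indices).

    Raises RuntimeError when no strict majority exists, i.e. an error was
    detected but cannot be corrected. Outputs must be hashable.
    """
    counts = Counter(replica_outputs)
    value, votes = counts.most_common(1)[0]
    if votes * 2 <= len(replica_outputs):
        raise RuntimeError("no majority: error detected but not correctable")
    faulty = [i for i, out in enumerate(replica_outputs) if out != value]
    return value, faulty


def run_replicated(func, args, replicas=3, fault=None):
    """Run `func` once per replica and vote on the results.

    `fault` is a hypothetical test hook: a dict mapping a replica index to a
    function that corrupts that replica's result (for example, a bit flip).
    """
    outputs = []
    for i in range(replicas):
        result = func(*args)
        if fault is not None and i in fault:
            result = fault[i](result)
        outputs.append(result)
    return vote(outputs)
```

With `replicas=3`, a single corrupted result is reported in `faulty_indices` and the majority value is still returned; with `replicas=2`, a mismatch raises, since there is nothing to outvote the faulty replica.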
Optimistic replication
Data replication is a key technology in distributed data sharing systems, enabling higher availability and performance. This paper surveys optimistic replication algorithms that allow replica contents to diverge in the short term, in order to support concurrent work practices and to tolerate failures in low-quality communication links. The importance of such techniques is increasing as collaboration through wide-area and mobile networks becomes popular. Optimistic replication techniques are different from traditional "pessimistic" ones. Instead of synchronous replica coordination, an optimistic algorithm propagates changes in the background, discovers conflicts after they happen and reaches agreement on the final contents incrementally. We explore the solution space for optimistic replication algorithms. This paper identifies key challenges facing optimistic replication systems (ordering operations, detecting and resolving conflicts, propagating changes efficiently, and bounding replica divergence) and provides a comprehensive survey of techniques developed for addressing these challenges.
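To make "discovers conflicts after they happen" concrete, the sketch below uses version vectors, one well-known way to tell whether two replicas were updated concurrently. It is a minimal example under stated assumptions, not a summary of any particular algorithm in the survey: the `Replica` class, the `resolve` callback, and the relation labels are hypothetical, and the resolver is assumed to be deterministic so that replicas converge.

```python
def compare(vv_a, vv_b):
    """Compare two version vectors (dicts mapping replica id to update count)."""
    keys = set(vv_a) | set(vv_b)
    a_ahead = any(vv_a.get(k, 0) > vv_b.get(k, 0) for k in keys)
    b_ahead = any(vv_b.get(k, 0) > vv_a.get(k, 0) for k in keys)
    if a_ahead and b_ahead:
        return "conflict"  # concurrent updates: neither history contains the other
    if a_ahead:
        return "a_newer"
    if b_ahead:
        return "b_newer"
    return "equal"


def merge(vv_a, vv_b):
    """Element-wise maximum of two version vectors."""
    return {k: max(vv_a.get(k, 0), vv_b.get(k, 0)) for k in set(vv_a) | set(vv_b)}


class Replica:
    def __init__(self, replica_id):
        self.id = replica_id
        self.value = None
        self.version = {}

    def write(self, value):
        """Apply a local update without coordinating with other replicas."""
        self.value = value
        self.version[self.id] = self.version.get(self.id, 0) + 1

    def sync_from(self, other, resolve):
        """Pull state from `other`, resolving a conflict if one is detected."""
        relation = compare(self.version, other.version)
        if relation == "b_newer":
            self.value = other.value
        elif relation == "conflict":
            self.value = resolve(self.value, other.value)
        self.version = merge(self.version, other.version)
        return relation
```

Writes succeed locally at once; divergence is only noticed at `sync_from`, where the vectors show whether one replica simply lags or both changed independently.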
Extended Fault Taxonomy of SOA-Based Systems
Service Oriented Architecture (SOA) is considered a standard for enterprise software development. The main characteristics of SOA are dynamic discovery and composition of software services in a heterogeneous environment. These properties pose new challenges in fault management of SOA-based systems (SBS). A proper understanding of the different faults in an SBS is necessary for effective fault handling. A comprehensive three-fold fault taxonomy is presented here that covers distributed, SOA-specific and non-functional faults in a holistic manner. A comprehensive fault taxonomy is a key starting point for providing techniques and methods for assessing the quality of a given system. In this paper, an attempt has been made to organize several SBS faults into a well-structured taxonomy that may assist developers in planning suitable fault repair strategies. Some commonly emphasized fault recovery strategies are also discussed, and some challenges that may occur during fault handling of SBSs are also mentioned.
Issues in building mobile-aware applications with the Rover Toolkit
Thesis (M.S.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1996. Includes bibliographical references (p. 69-73). By Joshua A. Tauber.
Object replication in a distributed system
PhD Thesis. A number of techniques have been proposed for the construction of fault-tolerant
applications. One of these techniques is to replicate vital system resources so that if one
copy fails sufficient copies may still remain operational to allow the application to
continue to function. Interactions with replicated resources are inherently more complex
than non-replicated interactions, and hence some form of replication transparency is
necessary. This may be achieved by employing replica consistency protocols to mask replica
failures and maintain consistency of state between functioning replicas.
To achieve consistency between replicas it is necessary to ensure that all replicas
receive the same set of messages in the same order, despite failures at the senders and
receivers. This can be accomplished by making use of order preserving reliable
communication protocols. However, we shall show how it can be more efficient to use
unordered reliable communication and to impose ordering at the application level, by
making use of syntactic knowledge of the application.
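The idea of imposing order above an unordered but reliable transport can be sketched as follows. The abstract does not say how the thesis derives ordering from "syntactic knowledge of the application", so this example substitutes a simpler, generic technique: sender-assigned sequence numbers with a receive-side buffer. The class name `OrderingBuffer`, the `deliver` callback, and the single-sender assumption are all hypothetical.

```python
class OrderingBuffer:
    """Deliver messages in sequence-number order, buffering early arrivals.

    Assumes a single sender that numbers its messages 0, 1, 2, ... and a
    transport that eventually delivers every message, in any order and
    possibly more than once.
    """

    def __init__(self, deliver):
        self.deliver = deliver  # callback applied to each message, in order
        self.next_seq = 0       # next sequence number to hand to the application
        self.pending = {}       # early messages, keyed by sequence number

    def receive(self, seq, message):
        if seq < self.next_seq or seq in self.pending:
            return  # duplicate: already delivered or already buffered
        self.pending[seq] = message
        # Drain every message that is now contiguous with what was delivered.
        while self.next_seq in self.pending:
            self.deliver(self.pending.pop(self.next_seq))
            self.next_seq += 1
```

Two replicas that receive the same messages in different network orders still apply them in the same order, which is the property replica consistency needs.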
This thesis develops techniques for replicating objects: in general this is harder than
replicating data, as objects (which can contain data) can contain calls on other objects.
Handling replicated objects is essentially the same as handling replicated computations,
and presents more problems than simply replicating data. We shall use the concept of the
object to provide transparent replication to users: a user will interact with only a single
object interface which hides the fact that the object is actually replicated.
The main aspects of the replication scheme presented in this thesis have been fully
implemented and tested. This includes the design and implementation of a replicated
object invocation protocol and the algorithms which ensure that (replicated) atomic
actions can manipulate replicated objects.
Research Studentship, Science and Engineering Research Council.
Esprit Project 2267 (Integrated Systems Architecture)