Search CORE

180 research outputs found

How robust are distributed systems

Author: Birman Kenneth P.
Publication venue
Publication date: 01/06/1989
Field of study

A distributed system is made up of large numbers of components operating asynchronously from one another and hence with imcomplete and inaccurate views of one another's state. Load fluctuations are common as new tasks arrive and active tasks terminate. Jointly, these aspects make it nearly impossible to arrive at detailed predictions for a system's behavior. It is important to the successful use of distributed systems in situations in which humans cannot provide the sorts of predictable realtime responsiveness of a computer, that the system be robust. The technology of today can too easily be affected by worn programs or by seemingly trivial mechanisms that, for example, can trigger stock market disasters. Inventors of a technology have an obligation to overcome flaws that can exact a human cost. A set of principles for guiding solutions to distributed computing problems is presented

NASA Technical Reports Server

eCommons@Cornell

Maintaining consistency in distributed systems

Author: Birman Kenneth P.
Publication venue
Publication date: 01/01/1991
Field of study

In systems designed as assemblies of independently developed components, concurrent access to data or data structures normally arises within individual programs, and is controlled using mutual exclusion constructs, such as semaphores and monitors. Where data is persistent and/or sets of operation are related to one another, transactions or linearizability may be more appropriate. Systems that incorporate cooperative styles of distributed execution often replicate or distribute data within groups of components. In these cases, group oriented consistency properties must be maintained, and tools based on the virtual synchrony execution model greatly simplify the task confronting an application developer. All three styles of distributed computing are likely to be seen in future systems - often, within the same application. This leads us to propose an integrated approach that permits applications that use virtual synchrony with concurrent objects that respect a linearizability constraint, and vice versa. Transactional subsystems are treated as a special case of linearizability

CiteSeerX

NASA Technical Reports Server

eCommons@Cornell

The ISIS Project: Real Experience with a Fault Tolerant Programming System

Author: Birman Kenneth
Cooper Robert
Publication venue
Publication date: 01/07/1990
Field of study

The ISIS project has developed a distributed programming toolkit and a collection of higher level applications based on these tools. ISIS is now in use at more than 300 locations world-wise. The lessons (and surprises) gained from this experience with the real world are discussed

NASA Technical Reports Server

eCommons@Cornell

How to securely replicate services (preliminary version)

Author: Birman Kenneth
Reiter Michael
Publication venue
Publication date
Field of study

A method is presented for constructing replicated services that retain their availability and integrity despite several servers and clients being corrupted by an intruder, in addition to others failing benignly. More precisely, a service is replicated by 'n' servers in such a way that a correct client will accept a correct server's response if, for some prespecified parameter, k, at least k servers are correct and fewer than k servers are correct. The issue of maintaining causality among client requests is also addressed. A security breach resulting from an intruder's ability to effect a violation of causality in the sequence of requests processed by the service is illustrated. An approach to counter this problem is proposed that requires that fewer than k servers are corrupt and, to ensure liveness, that k is less than or = n - 2t, where t is the assumed maximum total number of both corruptions and benign failures suffered by servers in any system run. An important and novel feature of these schemes is that the client need not be able to identify or authenticate even a single server. Instead, the client is required only to possess at most two public keys for the service

NASA Technical Reports Server

Deceit: A flexible distributed file system

Author: Birman Kenneth
Marzullo Keith
Siegel Alex
Publication venue
Publication date: 01/11/1989
Field of study

Deceit, a distributed file system (DFS) being developed at Cornell, focuses on flexible file semantics in relation to efficiency, scalability, and reliability. Deceit servers are interchangeable and collectively provide the illusion of a single, large server machine to any clients of the Deceit service. Non-volatile replicas of each file are stored on a subset of the file servers. The user is able to set parameters on a file to achieve different levels of availability, performance, and one-copy serializability. Deceit also supports a file version control mechanism. In contrast with many recent DFS efforts, Deceit can behave like a plain Sun Network File System (NFS) server and can be used by any NFS client without modifying any client software. The current Deceit prototype uses the ISIS Distributed Programming Environment for all communication and process group management, an approach that reduces system complexity and increases system robustness

NASA Technical Reports Server

eCommons@Cornell

Integrating security in a group oriented distributed system

Author: Birman Kenneth
Gong LI
Reiter Michael
Publication venue
Publication date: 01/02/1992
Field of study

A distributed security architecture is proposed for incorporation into group oriented distributed systems, and in particular, into the Isis distributed programming toolkit. The primary goal of the architecture is to make common group oriented abstractions robust in hostile settings, in order to facilitate the construction of high performance distributed applications that can tolerate both component failures and malicious attacks. These abstractions include process groups and causal group multicast. Moreover, a delegation and access control scheme is proposed for use in group oriented systems. The focus is the security architecture; particular cryptosystems and key exchange protocols are not emphasized

NASA Technical Reports Server

eCommons@Cornell

The ISIS project: Fault-tolerance in large distributed systems

Author: Birman Kenneth P.
Marzullo Keith
Publication venue
Publication date
Field of study

The semi-annual status report covers activities of the ISIS project during the second half of 1989. The project had several independent objectives: (1) At the level of the ISIS Toolkit, ISIS release V2.0 was completed, containing bypass communication protocols. Performance of the system is greatly enhanced by this change, but the initial software release is limited in some respects. (2) The Meta project focused on the definition of the Lomita programming language for specifying rules that monitor sensors for conditions of interest and triggering appropriate reactions. This design was completed, and implementation of Lomita is underway on the Meta 2.0 platform. (3) The Deceit file system effort completed a prototype. It is planned to make Deceit available for use in two hospital information systems. (4) A long-haul communication subsystem project was completed and can be used as part of ISIS. This effort resulted in tools for linking ISIS systems on different LANs together over long-haul communications lines. (5) Magic Lantern, a graphical tool for building application monitoring and control interfaces, is included as part of the general ISIS releases

NASA Technical Reports Server

Programming your way out of the past: ISIS and the META Project

Author: Birman Kenneth P.
Marzullo Keith
Publication venue
Publication date
Field of study

The ISIS distributed programming system and the META Project are described. The ISIS programming toolkit is an aid to low-level programming that makes it easy to build fault-tolerant distributed applications that exploit replication and concurrent execution. The META Project is reexamining high-level mechanisms such as the filesystem, shell language, and administration tools in distributed systems

NASA Technical Reports Server

Process membership in asynchronous environments

Author: Birman Kenneth P.
Ricciardi Aleta M.
Publication venue
Publication date: 01/02/1993
Field of study

The development of reliable distributed software is simplified by the ability to assume a fail-stop failure model. The emulation of such a model in an asynchronous distributed environment is discussed. The solution proposed, called Strong-GMP, can be supported through a highly efficient protocol, and was implemented as part of a distributed systems software project at Cornell University. The precise definition of the problem, the protocol, correctness proofs, and an analysis of costs are addressed

NASA Technical Reports Server

eCommons@Cornell

Programming with process groups: Group and multicast semantics

Author: Birman Kenneth P.
Cooper Robert
Gleeson Barry
Publication venue
Publication date: 29/01/1991
Field of study

Process groups are a natural tool for distributed programming and are increasingly important in distributed computing environments. Discussed here is a new architecture that arose from an effort to simplify Isis process group semantics. The findings include a refined notion of how the clients of a group should be treated, what the properties of a multicast primitive should be when systems contain large numbers of overlapping groups, and a new construct called the causality domain. A system based on this architecture is now being implemented in collaboration with the Chorus and Mach projects

NASA Technical Reports Server

eCommons@Cornell