43 research outputs found

    Efficient Reliable Group Communication for Distributed Systems

    Many applications can profit from broadcast communication, but few operating systems provide primitives that make broadcast communication available to user applications. In this paper we introduce primitives for broadcast communication that have been integrated with the Amoeba distributed operating system. The semantics of the broadcast primitives are simple, powerful, and easy to understand. Our primitives, for example, guarantee total ordering of broadcast messages. The proposed primitives are also efficient: if a network supports physical multicast, a reliable broadcast can be done in just slightly more than two messages on average, so the performance of a reliable broadcast is roughly comparable to that of a remote procedure call. In addition, the primitives are flexible: user applications can, for example, trade performance against fault tolerance.

    Group Communication in Amoeba and its Applications

    Unlike many other operating systems, Amoeba is a distributed operating system that provides group communication (i.e., one-to-many communication). We wil

    An Evaluation of the Amoeba Group Communication System

    The Amoeba group communication system has two unique aspects: (1) it uses a sequencer-based protocol with negative acknowledgements for achieving a total order on all group messages; and (2) users choose the degree of fault tolerance they desire. This paper reports on our design decisions in retrospect, the performance of the Amoeba group system, and our experiences using the system. We conclude that sequencer-based group protocols achieve high performance (comparable to Amoeba's fast remote procedure call implementation), that the scalability of our sequencer-based protocols is limited by message processing time, and that the flexibility and modularity of user-level implementations of protocols is likely to outweigh the potential performance loss.
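
The sequencer idea mentioned in this abstract can be illustrated with a small in-memory sketch. It is not Amoeba's protocol or API: the class names `Sequencer` and `Member`, the direct method calls standing in for network messages, and the retransmission policy are all assumptions made for illustration. A single sequencer stamps every message with a global sequence number, and a member that notices a gap asks for the missing numbers (a negative acknowledgement) instead of acknowledging every message.

```python
from dataclasses import dataclass, field


@dataclass
class Sequencer:
    """Hypothetical sequencer: numbers every message and keeps a history."""
    next_seq: int = 0
    history: dict = field(default_factory=dict)

    def broadcast(self, payload):
        # Stamp the message with the next global sequence number.
        seq = self.next_seq
        self.next_seq += 1
        self.history[seq] = payload
        return seq, payload

    def retransmit(self, seq):
        # Answer a negative acknowledgement for a missed sequence number.
        return seq, self.history[seq]


@dataclass
class Member:
    """Hypothetical group member that delivers strictly in sequence order."""
    sequencer: Sequencer
    expected: int = 0
    pending: dict = field(default_factory=dict)
    delivered: list = field(default_factory=list)

    def receive(self, seq, payload):
        if seq < self.expected:
            return  # duplicate of an already delivered message
        self.pending[seq] = payload
        # A gap below seq means something was missed: request it (NACK).
        for missing in range(self.expected, seq):
            if missing not in self.pending:
                m_seq, m_payload = self.sequencer.retransmit(missing)
                self.pending[m_seq] = m_payload
        # Deliver every message that is now contiguous.
        while self.expected in self.pending:
            self.delivered.append(self.pending.pop(self.expected))
            self.expected += 1


def demo():
    sequencer = Sequencer()
    a, b = Member(sequencer), Member(sequencer)
    msgs = [sequencer.broadcast(p) for p in ("x", "y", "z")]
    for m in msgs:
        a.receive(*m)
    # Member b never sees message 1 and sees message 2 before message 0.
    b.receive(*msgs[2])
    b.receive(*msgs[0])
    return a.delivered, b.delivered
```

In `demo()`, both members end up with the same delivery order even though one of them lost a message and received the others out of order, which is the total-order property the abstract refers to. A real implementation would also bound the history and handle sequencer failure, both omitted here.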

    Resource provision in object oriented distributed systems


    R2PC: fault-tolerance made easy

    Fault-tolerance is a concept that is becoming more and more important as computers are increasingly being used in application areas such as process control, air-traffic control and communication systems. However, the construction of fault-tolerant software remains a very difficult task, as it requires extensive knowledge and experience on the part of the designers of the system. The Remote Procedure Call (RPC) protocol and its many variants are a fundamental mechanism that provides an adequate level of abstraction for the construction of distributed applications and releases programmers from the burden of dealing with low-level networking protocols. However, the standard definition of the protocol does not provide semantics that are sufficiently transparent to deal with unexpected hardware and software faults, so the programmer has to deal with the problems that may occur. To address this, different reliable variations of the RPC protocol have been defined. This dissertation introduces a new reliable protocol, R2PC, with the following characteristics:
    • Symmetric treatment of client and server processes.
    • Use of concurrently processed nested calls in stateful servers.
    • The achievement of failure transparency at the application level.

    Group-oriented coordination models for distributed client-server computing

    This paper describes group-oriented control models for distributed client-server interactions. These models transparently coordinate requests for services that involve multiple servers, such as queries across distributed databases. Specific capabilities include: decomposing and replicating client requests; dispatching request subtasks or copies to independent, networked servers; and combining server results into a single response for the client. The control models were implemented by combining request broker and process group technologies with an object-oriented communication middleware tool. The models are illustrated in the context of a distributed operations support application for space-based systems.
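
The three capabilities listed in this abstract (replicate or decompose a request, dispatch it to several servers, combine the results) follow a scatter-gather pattern, which can be sketched as below. This is only an illustration under assumptions: the servers are plain local functions rather than networked processes, and every name (`scatter_gather`, `make_server`, `replicate`, `merge`) is hypothetical and not taken from the paper.

```python
from concurrent.futures import ThreadPoolExecutor


def scatter_gather(request, servers, decompose, combine):
    """Split or copy a request, run one piece per server, merge the results."""
    subtasks = decompose(request, len(servers))
    with ThreadPoolExecutor(max_workers=len(servers)) as pool:
        futures = [pool.submit(server, task)
                   for server, task in zip(servers, subtasks)]
        results = [f.result() for f in futures]  # kept in server order
    return combine(results)


def make_server(rows):
    """Hypothetical server holding one shard of a distributed data set."""
    def server(predicate):
        return [r for r in rows if predicate(r)]
    return server


def replicate(request, n):
    # Replication: every server receives a copy of the same query.
    return [request] * n


def merge(results):
    # Combination: flatten the per-server answers into one sorted response.
    return sorted(x for part in results for x in part)


servers = [make_server([1, 8, 3]), make_server([10, 2]), make_server([7])]
answer = scatter_gather(lambda r: r > 2, servers, replicate, merge)
```

The client sees a single call and a single `answer`, while the fan-out to the three shards stays hidden inside `scatter_gather`. Swapping `replicate` for a function that splits the request would give the decomposition variant.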

    Naming issues in the design of transparently distributed operating systems

    PhD Thesis. Naming is of fundamental importance in the design of transparently distributed operating systems. A transparently distributed operating system should be functionally equivalent to the systems of which it is composed. In particular, the names of remote objects should be indistinguishable from the names of local objects. In this thesis we explore the implication that this recursive notion of transparency has for the naming mechanisms provided by an operating system. In particular, we show that a recursive naming system is more readily extensible than a flat naming system by demonstrating that it is in precisely those areas in which a system is not recursive that transparency is hardest to achieve. However, this is not so much a problem of distribution as a problem of scale. A system which does not scale well internally will not extend well to a distributed system. Building a distributed system out of existing systems involves joining the name spaces of the individual systems together. When combining name spaces it is important to preserve the identity of individual objects. Although unique identifiers may be used to distinguish objects within a single name space, we argue that it is difficult if not impossible in practice to guarantee the uniqueness of such identifiers between name spaces. Instead, we explore the possibility of using hierarchical identifiers, unique only within a localised context. However, we show that such identifiers cannot be used in an arbitrary naming graph without compromising the notion of identity and hence violating the semantics of the underlying system. The only alternative is to sacrifice a deterministic notion of identity by using random identifiers to approximate global uniqueness with a known probability of failure (which can be made arbitrarily small if the overall size of the system is known in advance).
    UK Science and Engineering Research Council
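
The closing remark of this abstract, that random identifiers give a known failure probability which can be made arbitrarily small when the system size is known, can be made concrete with the standard birthday-problem estimate. The sketch below is not taken from the thesis; it assumes identifiers are drawn uniformly at random, and the function names are hypothetical.

```python
import math


def collision_probability(n_objects, id_bits):
    """Approximate chance that any two of n uniformly random IDs coincide."""
    space = 2 ** id_bits
    # Birthday approximation: p ~ 1 - exp(-n(n-1) / (2 * space)).
    return -math.expm1(-n_objects * (n_objects - 1) / (2 * space))


def bits_needed(n_objects, max_failure):
    """Smallest ID width whose union bound n(n-1)/(2 * 2^bits) is below max_failure."""
    if n_objects < 2:
        return 0  # a single object can never collide
    pairs = n_objects * (n_objects - 1) / 2
    return max(0, math.ceil(math.log2(pairs / max_failure)))


# Example: one million objects, failure probability at most one in a billion.
width = bits_needed(10**6, 1e-9)
risk = collision_probability(10**6, width)
```

Because the required width depends on the number of objects, the bound can only be fixed in advance when the overall size of the system is known, which matches the caveat in the abstract.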

    Management of object-oriented action-based distributed programs

    PhD Thesis. This thesis addresses the problem of managing the runtime behaviour of distributed programs. The thesis of this work is that management is fundamentally an information processing activity and that the object model, as applied to action-based distributed systems and database systems, is an appropriate representation of the management information. In this approach, the basic concepts of classes, objects, relationships, and atomic transition systems are used to form object models of distributed programs. Distributed programs are collections of objects whose methods are structured using atomic actions, i.e., atomic transactions. Object models are formed of two submodels, each representing a fundamental aspect of a distributed program. The structural submodel represents a static perspective of the distributed program, and the control submodel represents a dynamic perspective of it. Structural models represent the program's objects, classes and their relationships. Control models represent the program's object states, events, guards and actions, that is, a transition system. Resolution of queries on the distributed program's object model enables the management system to control certain activities of distributed programs. At a different level of abstraction, the distributed program can be seen as a reactive system where two subprograms interact: an application program and a management program; they interact only through sensors and actuators. Sensors are methods used to probe an object's state and actuators are methods used to change an object's state. The management program is capable of prodding the application program into action by activating sensors and actuators available at the interface of the application program. Actions are determined by management policies that are encoded in the management program.
    This way of structuring the management system encourages a clear modularization of application and management distributed programs, allowing better separation of concerns. Management concerns can be dealt with by the management program, while functional concerns can be assigned to the application program. The object-oriented action-based computational model adopted by the management system provides a natural framework for the implementation of fault-tolerant distributed programs. Object orientation provides modularity and extensibility through object encapsulation. Atomic actions guarantee the consistency of the objects of the distributed program despite concurrency and failures. Replication of the distributed program provides increased fault-tolerance by guaranteeing the consistent progress of the computation, even though some of the replicated objects can fail. A prototype management system based on the management theory proposed above has been implemented atop Arjuna, an object-oriented programming system which provides a set of tools for constructing fault-tolerant distributed programs. The management system is composed of two subsystems: Stabilis, a management system for structural information, and Vigil, a management system for control information. Example applications have been implemented to illustrate the use of the management system and gather experimental evidence to give support to the thesis.
    CNPq (Conselho Nacional de Desenvolvimento Científico e Tecnológico, Brazil); BROADCAST (Basic Research On Advanced Distributed Computing: from Algorithms to SysTems)
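
The sensor and actuator interface described in this abstract can be pictured with a toy, single-process sketch. It is not the Stabilis or Vigil design: the application object `WorkQueue`, the `Manager` class, and the two scaling policies are hypothetical, and atomic actions and distribution are left out. The only point illustrated is that the management program touches the application solely through sensor methods (read state) and actuator methods (change state), with policies expressed as guard and action pairs.

```python
class WorkQueue:
    """Hypothetical application object exposing one sensor and one actuator."""

    def __init__(self):
        self.items = []
        self.workers = 1

    def backlog(self):
        # Sensor: probes the object's state without changing it.
        return len(self.items)

    def set_workers(self, n):
        # Actuator: changes the object's state on behalf of management.
        self.workers = n


class Manager:
    """Hypothetical management program driven by (name, guard, action) policies."""

    def __init__(self, policies):
        self.policies = policies

    def step(self, target):
        fired = []
        for name, guard, action in self.policies:
            if guard(target):       # guard reads state through sensors
                action(target)      # action changes state through actuators
                fired.append(name)
        return fired


policies = [
    ("scale-up",
     lambda q: q.backlog() > 10 and q.workers < 4,
     lambda q: q.set_workers(q.workers + 1)),
    ("scale-down",
     lambda q: q.backlog() == 0 and q.workers > 1,
     lambda q: q.set_workers(q.workers - 1)),
]

queue = WorkQueue()
queue.items.extend(range(25))
manager = Manager(policies)
fired = manager.step(queue)
```

Keeping the policies in the manager and the work in `WorkQueue` mirrors the separation of management and functional concerns that the abstract argues for.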

    Management of concurrency in a reliable object-oriented computing system

    PhD Thesis. Modern computing systems support concurrency as a means of increasing the performance of the system. However, the potential for increased performance is not without its problems. For example, lost updates and inconsistent retrieval are but two of the possible consequences of unconstrained concurrency. Many concurrency control techniques have been designed to combat these problems; this thesis considers the applicability of some of these techniques in the context of a reliable object-oriented system supporting atomic actions. The object-oriented programming paradigm is one approach to handling the inherent complexity of modern computer programs. By modeling entities from the real world as objects which have well-defined interfaces, the interactions in the system can be carefully controlled. By structuring sequences of such interactions as atomic actions, the consistency of the system is assured. Objects are encapsulated entities such that their internal representation is not externally visible. This thesis postulates that this encapsulation should also include the capability for an object to be responsible for its own concurrency control. Given this latter assumption, this thesis explores the means by which the property of type-inheritance possessed by object-oriented languages can be exploited to allow programmers to explicitly control the level of concurrency an object supports. In particular, an object-oriented concurrency controller based upon the technique of two-phase locking is described and implemented using type-inheritance. The thesis also shows how this inheritance-based approach is highly flexible such that the basic concurrency control capabilities can be adopted unchanged or overridden with more type-specific concurrency control if required.
    UK Science and Engineering Research Council, Serc/Alve
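
The inheritance-based approach described in this abstract can be sketched roughly as follows. This is not the thesis's implementation: the classes `Action`, `LockManaged`, and `Counter`, the lock mode names, and the choice to refuse a conflicting lock rather than block are all assumptions, and recovery (undoing the effects of an aborted action) is omitted. A base class supplies two-phase locking with the usual read/write conflict rule, and a subclass overrides only the conflict rule to allow more type-specific concurrency.

```python
class LockRefused(Exception):
    """Raised when a requested lock conflicts with one held by another action."""


class Action:
    """Hypothetical atomic action: acquires locks as it runs, releases all at the end."""

    def __init__(self, name):
        self.name = name
        self.held = []
        self.shrinking = False

    def lock(self, obj, mode):
        if self.shrinking:
            raise RuntimeError("two-phase rule: no new locks once release has begun")
        obj.set_lock(self, mode)
        self.held.append((obj, mode))

    def end(self):
        # Shrinking phase: every lock is released together when the action ends.
        self.shrinking = True
        for obj, mode in self.held:
            obj.release(self, mode)
        self.held.clear()


class LockManaged:
    """Base class: each object is responsible for its own concurrency control."""

    def __init__(self):
        self._locks = []

    def conflicts(self, held_mode, requested_mode):
        # Default rule: only read locks are compatible with each other.
        return not (held_mode == "read" and requested_mode == "read")

    def set_lock(self, action, mode):
        for owner, held_mode in self._locks:
            if owner is not action and self.conflicts(held_mode, mode):
                raise LockRefused(f"{mode} conflicts with {held_mode} held by {owner.name}")
        self._locks.append((action, mode))

    def release(self, action, mode):
        self._locks.remove((action, mode))


class Counter(LockManaged):
    """Type-specific rule: increments commute, so two 'add' locks do not conflict."""

    def __init__(self):
        super().__init__()
        self.value = 0

    def conflicts(self, held_mode, requested_mode):
        if held_mode == "add" and requested_mode == "add":
            return False
        return super().conflicts(held_mode, requested_mode)

    def add(self, action, n):
        action.lock(self, "add")
        self.value += n


t1, t2 = Action("t1"), Action("t2")
counter = Counter()
counter.add(t1, 1)
counter.add(t2, 2)  # allowed: the overridden rule lets concurrent adds proceed
```

Under the inherited default rule the second `add` would have been refused. A read by `t2` is still refused while `t1` holds its `add` lock, and becomes possible only after `t1.end()`.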