5 research outputs found

    A review of experiences with reliable multicast

    A middleware service for fault-tolerant group communications

    PhD thesis. Many distributed applications require multicast group communication services, which enable an entity to interact with a group of other entities. Providing the reliability and ordering guarantees required by group-based applications is not a trivial task in distributed systems where computation and communication delays might not be known accurately. Furthermore, the approaches available to support these guarantees are diverse. The choice of approach may significantly affect the performance of an application, and a given approach may not be suitable for some application types. Nowadays, distributed applications are frequently built as a middleware service. The thesis develops techniques for providing group communication support in middleware environments. A group communication service has been designed and implemented in such a way as not to hinder the interoperability or portability of applications built using it. The service provides a variety of functions that may be tailored to suit many different types of applications. Group communication protocols are presented that ensure reliability and ordering guarantees, and these guarantees may themselves be tailored to suit a wide variety of applications. The service also supports mechanisms that provide a variety of approaches to inter-member and inter-group interactions, suitable for satisfying the requirements of many different types of applications (e.g., fault-tolerant, collaborative). The service can work over local and wide area networks (Internet).
    Hewlett Packard Laboratories; Engineering and Physical Science Research Council

    The CORBA object group service: a service approach to object groups in CORBA

    Distributed computing is one of the major trends in the computer industry. As systems become more distributed, they also become more complex and have to deal with new kinds of problems, such as partial crashes and link failures. To answer the growing demand for distributed technologies, several middleware environments have emerged during the last few years. These environments, however, lack support for "one-to-many" communication primitives; such primitives greatly simplify the development of several types of applications that have requirements for high availability, fault tolerance, parallel processing, or collaborative work. One-to-many interactions can be provided by group communication, which manages groups of objects and provides primitives for sending messages to all members of a group, with various reliability and ordering guarantees. A group constitutes a logical addressing facility: messages can be issued to a group without having to know the number, identity, or location of individual members. The notion of group has proven to be very useful for providing high availability through replication: a set of replicas constitutes a group but is viewed by clients as a single entity in the system. This thesis aims at studying and proposing solutions to the problem of object group support in object-based middleware environments. It surveys and evaluates different approaches to this problem. Based on this evaluation, we propose a system model and an open architecture to add support for object groups to the CORBA middleware environment. In doing so, we provide the application developer with powerful group primitives in the context of a standard object-based environment. This thesis contributes to ongoing standardization efforts that aim to support fault tolerance in CORBA, using entity redundancy. The group architecture proposed in this thesis, the Object Group Service (OGS), is based on the concept of component integration. It consists of several distinct components that provide various facilities for reliable distributed computing and that are reusable in isolation. Group support is ultimately provided by combining these components. OGS defines an object-oriented framework of CORBA components for reliable distributed systems. The OGS components include a group membership service, which keeps track of the composition of object groups; a group multicast service, which provides delivery of messages to all group members; a consensus service, which allows several CORBA objects to resolve distributed agreement problems; and a monitoring service, which provides distributed failure detection mechanisms. OGS includes support for dynamic group membership and for group multicast with various reliability and ordering guarantees. It defines interfaces for active and primary-backup replication. In addition, OGS proposes several execution styles and various levels of transparency. A prototype implementation of OGS has been realized in the context of this thesis. This implementation is available for two commercial ORBs (Orbix and VisiBroker). It relies solely on the CORBA specification, and is thus portable to any compliant ORB. Although the main theme of this thesis deals with system architecture, we have developed some original algorithms to implement group support in OGS. We analyze these algorithms and implementation choices in this dissertation, and we evaluate them in terms of efficiency. We also illustrate the use of OGS through example applications.
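The idea of a group as a logical addressing facility can be illustrated with a small in-process sketch. This is not the OGS interface and uses no CORBA: the class `ObjectGroup` and its methods `join`, `leave`, and `multicast` are hypothetical names chosen for illustration, and the sketch deliberately omits the reliability, ordering, and failure-detection guarantees that the abstract attributes to the real service.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# A member is modelled as a callback that receives a message (an assumption
# made for this sketch; real group members would be remote objects).
Handler = Callable[[str], None]


@dataclass
class ObjectGroup:
    """Toy group: one name that stands in for a changing set of members."""
    name: str
    members: Dict[str, Handler] = field(default_factory=dict)

    def join(self, member_id: str, handler: Handler) -> None:
        # Dynamic membership: members may be added at any time.
        self.members[member_id] = handler

    def leave(self, member_id: str) -> None:
        # Removing an unknown member is silently ignored.
        self.members.pop(member_id, None)

    def multicast(self, message: str) -> int:
        """Deliver the message to every current member.

        The sender addresses the group only; it never needs the number,
        identity, or location of the members. Returns the number of
        members the message was handed to.
        """
        view = list(self.members.values())  # snapshot of the current view
        for handler in view:
            handler(message)
        return len(view)
```

A client holding only the `ObjectGroup` reference keeps calling `multicast` unchanged while replicas join and leave, which is the property that makes groups convenient for replication.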

    Dynamic Upgrade of Distributed Software Components

    Upgrading complex telecommunication software systems, characterized by their inherent distribution and a very high cost of system unavailability, is a difficult and error-prone maintenance activity. Even more challenging are software upgrades that must not compromise system availability. Dynamic upgrade is a technique that automates performing and managing upgrades so that the software system remains operational during the upgrade. In this thesis, dynamic upgrade is considered a special case of software deployment, in which a running service has to be replaced with its new version. The problems of dynamic upgrade are introduced using a novel taxonomy that classifies the design issues to be solved when building support for dynamic upgrade with regard to three system aspects: deployment, evolution, and dependability. The taxonomy also provides a reference for comparing other systems that support dynamic upgrades. An extensive survey of existing approaches to dynamic upgrade follows and serves as the starting point for designing a solution that supports dynamic upgrades in distributed component-based software systems. Derived from the problem analysis, a model called the Deployment and Upgrade Facility is developed using the Unified Process approach; it describes the capabilities needed for managing and performing dynamic upgrades as well as the properties of upgradable software components. The model is platform independent and can be used with a range of underlying middleware technologies. The model is implemented in a Java-based prototype framework and extended with platform-specific mechanisms on top of the Jgroup/ARM middleware. The framework captures common design solutions and patterns for building support for dynamic upgrade. It allows the life cycle of upgrade processes to be controlled and coordinated in the system. It also defines a number of supporting mechanisms and algorithms for the upgrade process. Special attention is given to an upgrade algorithm for replicated software components, which combines replication techniques with dynamic upgrade. The framework is used in a number of practical experiments to validate the feasibility of the approach and to measure the overhead that the mechanisms supporting dynamic upgrade impose on the performance of the system being upgraded. The quantitative evaluation of these experiments leads to the specification of a simple benchmark for systems supporting dynamic upgrades.

    DPCP (Discard Past Consider Present) - A Novel Approach to Adaptive Fault Detection in Distributed Systems

    Fault detection is a fundamental issue for fault tolerance in distributed systems. This paper presents the DPCP (Discard Past Consider Present) approach, which discards the previous elapsed times of fault detection messages and considers only the current one. In this way, DPCP provides fast, accurate, and scalable adaptive fault monitoring for asynchronous distributed systems. The scalability comes from the Minimum-TimeUnit parameter, which controls the minimum frequency of the fault monitoring messages. The speed and accuracy of fault monitoring come from changing the timeout and monitoring interval values as soon as the system workload and the Minimum-TimeUnit allow. Experiments with DPCP on ACE+TAO were carried out to observe its behavior under changing network workloads.
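The "consider only the present" idea can be sketched as a monitor that keeps no history of elapsed times. The abstract does not give the paper's actual update rule, so everything specific here is an assumption: the class name `PresentOnlyMonitor`, the `safety_factor` margin, the rule "timeout = interval × safety_factor", and the reading of Minimum-TimeUnit as the smallest allowed interval between monitoring messages.

```python
from dataclasses import dataclass, field


@dataclass
class PresentOnlyMonitor:
    """Adaptive timeout driven only by the most recent elapsed time.

    Assumed rule for illustration, not taken from the DPCP paper.
    """
    min_time_unit: float        # assumed: floor on the monitoring interval (s)
    safety_factor: float = 2.0  # assumed margin between interval and timeout
    interval: float = field(init=False)
    timeout: float = field(init=False)

    def __post_init__(self) -> None:
        # Start at the fastest rate the floor permits.
        self.interval = self.min_time_unit
        self.timeout = self.interval * self.safety_factor

    def on_reply(self, elapsed: float) -> None:
        """Recompute both values from the current sample alone.

        No average or window of past samples is kept, so one slow reply
        stops influencing the timeout as soon as the next reply arrives.
        """
        self.interval = max(self.min_time_unit, elapsed)
        self.timeout = self.interval * self.safety_factor

    def is_suspected(self, waited: float) -> bool:
        # The monitored process is suspected once the wait exceeds the timeout.
        return waited > self.timeout
```

With `min_time_unit=0.5`, a 2.0 s reply raises the timeout to 4.0 s, and a following 0.1 s reply drops it straight back to 1.0 s; a detector that averaged over history would recover more slowly.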