45,350 research outputs found
Introduction to the special section on dependable network computing
Dependable network computing is becoming a key part of our daily economic and social life. Every day, millions of users and businesses are utilizing the Internet infrastructure for real-time electronic commerce transactions, scheduling important events, and building relationships. While network traffic and the number of users are rapidly growing, the mean-time between failures (MTTF) is surprisingly short; according to recent studies, in the majority of Internet backbone paths, the MTTF is 28 days. This leads to a strong requirement for highly dependable networks, servers, and software systems. The challenge is to build interconnected systems, based on available technology, that are inexpensive, accessible, scalable, and dependable. This special section provides insights into a number of these exciting challenges
Designing application software in wide area network settings
Progress in methodologies for developing robust local area network software has not been matched by similar results for wide area settings. The design of application software spanning multiple local area environments is examined. For important classes of applications, simple design techniques are presented that yield fault tolerant wide area programs. An implementation of these techniques as a set of tools for use within the ISIS system is described
Programming with process groups: Group and multicast semantics
Process groups are a natural tool for distributed programming and are increasingly important in distributed computing environments. Discussed here is a new architecture that arose from an effort to simplify Isis process group semantics. The findings include a refined notion of how the clients of a group should be treated, what the properties of a multicast primitive should be when systems contain large numbers of overlapping groups, and a new construct called the causality domain. A system based on this architecture is now being implemented in collaboration with the Chorus and Mach projects
A metaobject architecture for fault-tolerant distributed systems : the FRIENDS approach
The FRIENDS system developed at LAAS-CNRS is a metalevel architecture providing libraries of metaobjects for fault
tolerance, secure communication, and group-based distributed applications. The use of metaobjects provides a nice separation of concerns between mechanisms and applications. Metaobjects can be used transparently by applications and can be composed according to the needs of a given application, a given architecture, and its underlying properties. In FRIENDS, metaobjects are used recursively to add new properties to applications. They are designed using an object oriented design method and implemented on top of basic system services. This paper describes the FRIENDS software-based architecture, the object-oriented development of metaobjects, the experiments that we have done, and summarizes the advantages and drawbacks of a metaobject approach for building fault-tolerant system
Implementing fault tolerant applications using reflective object-oriented programming
Abstract: Shows how reflection and object-oriented programming can be used to ease the implementation of classical fault tolerance mechanisms in distributed applications. When the underlying runtime system does not provide fault tolerance transparently, classical approaches to implementing fault tolerance mechanisms often imply mixing functional programming with non-functional programming (e.g. error processing mechanisms). The use of reflection improves the transparency of fault tolerance mechanisms to the programmer and more generally provides a clearer separation between functional and non-functional programming. The implementations of some classical replication techniques using a reflective approach are presented in detail and illustrated by several examples, which have been prototyped on a network of Unix workstations. Lessons learnt from our experiments are drawn and future work is discussed
System Description for a Scalable, Fault-Tolerant, Distributed Garbage Collector
We describe an efficient and fault-tolerant algorithm for distributed cyclic
garbage collection. The algorithm imposes few requirements on the local
machines and allows for flexibility in the choice of local collector and
distributed acyclic garbage collector to use with it. We have emphasized
reducing the number and size of network messages without sacrificing the
promptness of collection throughout the algorithm. Our proposed collector is a
variant of back tracing to avoid extensive synchronization between machines. We
have added an explicit forward tracing stage to the standard back tracing stage
and designed a tuned heuristic to reduce the total amount of work done by the
collector. Of particular note is the development of fault-tolerant cooperation
between traces and a heuristic that aggressively reduces the set of suspect
objects.Comment: 47 pages, LaTe
- âŠ