11,340 research outputs found

    An overview of the Amoeba distributed operating system

    Get PDF
    As hardware prices continue to drop rapidly, building large computer systems by interconnecting substantial numbers of microcomputers becomes increasingly attractive. Many techniques for interconnecting the hardware, such as Ethernet [Metcalfe and Boggs, 1976], ring nets [Farber and Larson, 1972], packet switching, and shared memory are well understood, but the corresponding software techniques are poorly understood. The design of general purpose distributed operating systems is one of the key research issues for the 1980s

    A metaobject architecture for fault-tolerant distributed systems : the FRIENDS approach

    Get PDF
    The FRIENDS system developed at LAAS-CNRS is a metalevel architecture providing libraries of metaobjects for fault tolerance, secure communication, and group-based distributed applications. The use of metaobjects provides a nice separation of concerns between mechanisms and applications. Metaobjects can be used transparently by applications and can be composed according to the needs of a given application, a given architecture, and its underlying properties. In FRIENDS, metaobjects are used recursively to add new properties to applications. They are designed using an object oriented design method and implemented on top of basic system services. This paper describes the FRIENDS software-based architecture, the object-oriented development of metaobjects, the experiments that we have done, and summarizes the advantages and drawbacks of a metaobject approach for building fault-tolerant system

    FADI: a fault-tolerant environment for open distributed computing

    Get PDF
    FADI is a complete programming environment that serves the reliable execution of distributed application programs. FADI encompasses all aspects of modern fault-tolerant distributed computing. The built-in user-transparent error detection mechanism covers processor node crashes and hardware transient failures. The mechanism also integrates user-assisted error checks into the system failure model. The nucleus non-blocking checkpointing mechanism combined with a novel selective message logging technique delivers an efficient, low-overhead backup and recovery mechanism for distributed processes. FADI also provides means for remote automatic process allocation on the distributed system nodes

    Deceit: A flexible distributed file system

    Get PDF
    Deceit, a distributed file system (DFS) being developed at Cornell, focuses on flexible file semantics in relation to efficiency, scalability, and reliability. Deceit servers are interchangeable and collectively provide the illusion of a single, large server machine to any clients of the Deceit service. Non-volatile replicas of each file are stored on a subset of the file servers. The user is able to set parameters on a file to achieve different levels of availability, performance, and one-copy serializability. Deceit also supports a file version control mechanism. In contrast with many recent DFS efforts, Deceit can behave like a plain Sun Network File System (NFS) server and can be used by any NFS client without modifying any client software. The current Deceit prototype uses the ISIS Distributed Programming Environment for all communication and process group management, an approach that reduces system complexity and increases system robustness

    FRIENDS - A flexible architecture for implementing fault tolerant and secure distributed applications

    Get PDF
    FRIENDS is a software-based architecture for implementing fault-tolerant and, to some extent, secure applications. This architecture is composed of sub-systems and libraries of metaobjects. Transparency and separation of concerns is provided not only to the application programmer but also to the programmers implementing metaobjects for fault tolerance, secure communication and distribution. Common services required for implementing metaobjects are provided by the sub-systems. Metaobjects are implemented using object-oriented techniques and can be reused and customised according to the application needs, the operational environment and its related fault assumptions. Flexibility is increased by a recursive use of metaobjects. Examples and experiments are also described

    Robust data storage in a network of computer systems

    Get PDF
    PhD ThesisRobustness of data in this thesis is taken to mean reliable storage of data and also high availability of data .objects in spite of the occurrence of faults. Algorithms and data structures which can be used to provide such robustness in the presence of various disk, processor and communication network failures are described. Reliable storage of data at individual nodes in a network of computer systems is based on the use of a stable storage mechanism combined with strategies which are used to help ensure crash resis- tance of file operations in spite of the use of buffering mechan- isms by operating systems. High availability of data in the net- work is maintained by replicating data on different computers and mutual consistency between replicas is ensured in spite of network partitioning. A stable storage system which provides atomicity for more complex data structures instead of the usual fixed size page has been designed and implemented and its performance evaluated. A crash resistant file system has also been implemented and evaluated. Many of the techniques presented here are used in the design of what we call CRES (Crash-resistant, Replicated and Stable) storage. CRES storage provides fault tolerance facilities for various disk and processor faults. It also provides fault tolerance facilities for network partitioning through the provision of an algorithm for the update and merge of a partitioned data storage system

    Clockwise: a mixed-media file system

    Get PDF
    This paper presents Clockwise, a mixed-media file system. The primary goal of Clockwise is to provide a storage architecture that supports the storage and retrieval of best-effort and real-time file system data. Clockwise provides an abstraction called a dynamic partition that groups lists of related (large) blocks on one or more disks. Dynamic partitions can grow and shrink in size and reading or writing of dynamic partitions can be scheduled explicitly. With respect to scheduling, Clockwise uses a novel strategy to pre-calculate schedule slack time and it schedules best-effort requests before queued real-time requests in this slack tim

    Communication and control in small batch part manufacturing

    Get PDF
    This paper reports on the development of a real-time control network as an integrated part of a shop floor control system for small batch part manufacturing. The shop floor control system is called the production control system (PCS). The PCS aims at an improved control of small batch part manufacturing systems, enabling both a more flexible use of resources and a decrease in the economical batch size. For this, the PCS integrates various control functions such as scheduling, dispatching, workstation control and monitoring, whilst being connected on-line to the production equipment on the shop floor. The PCS can be applied irrespective of the level of automation on the shop floor. The control network is an essential part of the PCS, as it provides a real-time connection between the different modules (computers) of the PCS, which are geographically distributed over the shop floor. An overview of the requirements of such a control network is given. The description of the design includes the services developed, the protocols used and the physical layout of the network. A prototype of the PCS, including the control network, has been installed and tested in a pilot plant. The control network has proven that it can supply a manufacturing environment, consisting of equipment from different vendors with different levels of automation, with a reliable, low cost, real-time communication facility
    • …
    corecore