
    ISIS and META projects

    The ISIS project has developed a new methodology, virtual synchrony, for writing robust distributed software. High-performance multicast, large-scale applications, and wide area networks are the focus of interest. Several interesting applications that exploit the strengths of ISIS, including an NFS-compatible replicated file system, are being developed. The META project addresses distributed control in a soft real-time environment incorporating feedback. This domain encompasses examples as diverse as monitoring inventory and consumption on a factory floor, and performing load balancing on a distributed computing system. One of the first uses of META is for distributed application management: the tasks of configuring a distributed program, dynamically adapting to failures, and monitoring its performance. Recent progress and current plans are reported.
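
    The virtual synchrony model mentioned above lets a group of processes observe membership changes (views) and multicasts in a consistent order. The sketch below is a minimal, hypothetical illustration of that programming model; the ProcessGroup class and every name in it are invented for illustration and do not correspond to the actual ISIS Toolkit API.

        # Hypothetical sketch of the virtual-synchrony programming model described
        # above: group members observe the same sequence of views (membership
        # changes) and receive multicasts within the view in which they were sent.
        # None of these names correspond to the real ISIS Toolkit API.

        class ProcessGroup:
            def __init__(self, name):
                self.name = name
                self.members = []          # current view: list of member ids
                self.view_id = 0           # monotonically increasing view number
                self.handlers = []         # application callbacks for messages

            def join(self, member_id):
                """Install a new view that includes the joining member."""
                self.members.append(member_id)
                self.view_id += 1
                self._deliver_view_change()

            def leave(self, member_id):
                """Install a new view after a member leaves or fails."""
                self.members.remove(member_id)
                self.view_id += 1
                self._deliver_view_change()

            def multicast(self, sender, payload):
                """Deliver a message to every member of the current view."""
                for member in self.members:
                    for handler in self.handlers:
                        handler(member, sender, payload, self.view_id)

            def on_message(self, handler):
                self.handlers.append(handler)

            def _deliver_view_change(self):
                print(f"view {self.view_id}: members = {self.members}")


        if __name__ == "__main__":
            group = ProcessGroup("replicated-file-servers")
            group.on_message(lambda m, s, p, v: print(f"{m} got {p!r} from {s} in view {v}"))
            group.join("srv-a")
            group.join("srv-b")
            group.multicast("srv-a", "update block 17")
            group.leave("srv-b")   # simulates a failure: survivors agree on the new view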

    Programming with process groups: Group and multicast semantics

    Process groups are a natural tool for distributed programming and are increasingly important in distributed computing environments. Discussed here is a new architecture that arose from an effort to simplify Isis process group semantics. The findings include a refined notion of how the clients of a group should be treated, what the properties of a multicast primitive should be when systems contain large numbers of overlapping groups, and a new construct called the causality domain. A system based on this architecture is now being implemented in collaboration with the Chorus and Mach projects.
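
    One property such a multicast primitive typically provides is causal delivery: a message is delivered only after every message that causally precedes it has been delivered. The snippet below is a generic, textbook-style sketch of that delivery rule using vector clocks; it is not the Isis primitive, and the CausalReceiver class and all names in it are hypothetical.

        # Generic sketch of causally ordered delivery with vector clocks: a message
        # from process p carrying clock V is deliverable at q once q has delivered
        # exactly V[p]-1 messages from p and at least V[r] messages from every other
        # process r. This illustrates the delivery property, not the Isis protocol.

        from collections import defaultdict

        class CausalReceiver:
            def __init__(self, processes):
                self.delivered = defaultdict(int)   # messages delivered per sender
                self.pending = []                   # (sender, vclock, payload)
                self.processes = processes

            def _deliverable(self, sender, vclock):
                if vclock[sender] != self.delivered[sender] + 1:
                    return False
                return all(vclock[p] <= self.delivered[p]
                           for p in self.processes if p != sender)

            def receive(self, sender, vclock, payload):
                self.pending.append((sender, dict(vclock), payload))
                progress = True
                while progress:
                    progress = False
                    for msg in list(self.pending):
                        s, v, body = msg
                        if self._deliverable(s, v):
                            self.pending.remove(msg)
                            self.delivered[s] += 1
                            print(f"deliver {body!r} from {s}")
                            progress = True


        if __name__ == "__main__":
            r = CausalReceiver(["p1", "p2"])
            # p2's message depends on p1's first message but arrives before it:
            r.receive("p2", {"p1": 1, "p2": 1}, "reply")    # buffered
            r.receive("p1", {"p1": 1, "p2": 0}, "request")  # delivers both, in causal order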

    The ISIS project: Fault-tolerance in large distributed systems

    The semi-annual status report covers activities of the ISIS project during the second half of 1989. The project had several independent objectives: (1) At the level of the ISIS Toolkit, ISIS release V2.0 was completed, containing bypass communication protocols. Performance of the system is greatly enhanced by this change, but the initial software release is limited in some respects. (2) The Meta project focused on the definition of the Lomita programming language for specifying rules that monitor sensors for conditions of interest and trigger appropriate reactions. This design was completed, and implementation of Lomita is underway on the Meta 2.0 platform. (3) The Deceit file system effort completed a prototype. It is planned to make Deceit available for use in two hospital information systems. (4) A long-haul communication subsystem project was completed and can be used as part of ISIS. This effort resulted in tools for linking ISIS systems on different LANs together over long-haul communication lines. (5) Magic Lantern, a graphical tool for building application monitoring and control interfaces, is included as part of the general ISIS releases.
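
    Item (2) describes Lomita as a language for rules that watch sensors for conditions of interest and trigger reactions. Lomita's own syntax is not shown in the report; the sketch below only illustrates the general rule pattern (a condition over sensor readings paired with a reaction) in Python, with names invented for the example.

        # Hypothetical illustration of the rule-based monitoring pattern described
        # for Lomita: each rule pairs a condition over current sensor readings with
        # a reaction to run when the condition holds. This is not Lomita syntax.

        rules = [
            {
                "name": "cpu-overload",
                "condition": lambda s: s.get("cpu_load", 0.0) > 0.9,
                "reaction": lambda s: print("reaction: migrate work off overloaded node"),
            },
            {
                "name": "disk-nearly-full",
                "condition": lambda s: s.get("disk_free_mb", 1e9) < 100,
                "reaction": lambda s: print("reaction: page the operator"),
            },
        ]

        def evaluate(readings):
            """Run every rule whose condition holds for the current readings."""
            for rule in rules:
                if rule["condition"](readings):
                    rule["reaction"](readings)

        evaluate({"cpu_load": 0.95, "disk_free_mb": 2048})   # fires cpu-overload only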

    Time constrained fault tolerance and management framework for k-connected distributed wireless sensor networks based on composite event detection

    Wireless sensor nodes are themselves complex systems in which a variety of components interact in intricate ways. In enterprise scenarios it becomes highly important to hide the details of the underlying sensor networks from the applications and to guarantee a minimum level of reliability of the system. One of the challenges in achieving this level of reliability is overcoming the failures frequently faced by sensor networks due to their tight integration with the environment. Failures can generate false information, which may trigger incorrect business processes and result in additional costs. Sensor networks are inherently fault prone due to the shared wireless communication medium; sensor nodes can lose synchrony and their programs can reach arbitrary states. Since on-site maintenance is not feasible, sensor network applications should be capable of local, communication-efficient self-healing. Also, to the best of our knowledge and based on an extensive survey of the related literature, no general framework exists that addresses all the fault issues one may encounter in a WSN. As one of the main goals of enterprise applications is to reduce the cost of business processes, a complete and general fault tolerance and management framework for WSNs, irrespective of node types and deployment conditions, is proposed; it would help mitigate the propagation of failures into the business environment, reduce installation and maintenance costs, and provide the deployment flexibility needed for unobtrusive installation.
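
    A common idea behind composite event detection, which the title refers to, is to accept a physical event only when several nearby nodes report it within a short time window, while an isolated report is treated as a suspected node fault. The sketch below is a generic, hypothetical illustration of that quorum rule, not the framework proposed in the paper; the threshold values and names are invented.

        # Generic sketch of composite event detection: a physical event is accepted
        # only if at least `quorum` neighbouring nodes report it within a short time
        # window; an isolated report is treated as a suspected node fault. This is
        # an illustration of the idea, not the paper's actual framework.

        from collections import defaultdict

        WINDOW_SECONDS = 5.0
        QUORUM = 3

        def classify(reports, quorum=QUORUM, window=WINDOW_SECONDS):
            """reports: list of (timestamp, node_id, event_type) tuples."""
            by_event = defaultdict(list)
            for ts, node, event in reports:
                by_event[event].append((ts, node))

            decisions = {}
            for event, hits in by_event.items():
                hits.sort()
                confirmed = False
                for i in range(len(hits)):
                    # distinct nodes reporting the same event within the window
                    nodes = {n for t, n in hits if 0 <= t - hits[i][0] <= window}
                    if len(nodes) >= quorum:
                        confirmed = True
                        break
                decisions[event] = "confirmed event" if confirmed else "suspected node fault"
            return decisions

        print(classify([
            (0.0, "n1", "fire"), (1.2, "n2", "fire"), (2.5, "n3", "fire"),  # confirmed
            (0.5, "n7", "flood"),                                           # isolated -> fault
        ]))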

    Formal and Fault Tolerant Design

    Software quality and reliability were for a long time verified at the post-implementation level (test, fault scenario ...). The design of embedded systems and digital circuits is more and more complex because of integration density and heterogeneity. Now almost ¾ of digital circuits contain at least one processor, that is, can execute software code. In other words, co-design is the usual case, and traditional verification by simulation is no longer practical. Moreover, the increase in integration density comes with a decrease in the reliability of the components. So fault detection, diagnostic techniques and introspection are essential for defect tolerance, fault tolerance and self-repair of safety-critical systems. The use of a formal specification language is considered the foundation of a real validation. What we would like to emphasize is that refinement (from an abstract model to the point where the system will be implemented) could and should be formal too, in order to ensure the traceability of requirements, to manage such development projects and so to design fault-tolerant systems correct by proven construction. Such a thorough approach can be achieved by the automation or semi-automation of the refinement process. We have studied how to ensure the traceability of these requirements in a component-based approach. Reliability and fault tolerance can be seen here as particular refinement steps. For instance, a given formal specification of a system/component may be refined by adding redundancy (data, computation, component) and be verified to be fault-tolerant w.r.t. some given fault scenarios. A self-repair component can be defined as the refinement of its original form enhanced with error detection. We describe in this paper the PCSI project (Zero Defect Systems) based on the B Method, VHDL and PSL. The three modeling approaches can collaborate and guarantee the co-design of embedded systems in which the requirements and fault-tolerance aspects are taken into account from the beginning and formally verified throughout the implementation process.
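
    The refinement-by-redundancy step described above can be made concrete with a simple analogy: triple modular redundancy, where three replicas of a computation run on the same input and a majority vote masks a single faulty replica. The sketch below is only that analogy, written in Python; the PCSI project itself works in the B Method, VHDL and PSL, and none of the names here come from it.

        # Illustration of one redundancy pattern that a fault-tolerance refinement
        # step can introduce: triple modular redundancy (TMR). Three replicas of a
        # computation run on the same input and a majority vote masks a single
        # faulty replica. This is an analogy in Python, not B/VHDL/PSL.

        from collections import Counter

        def tmr(replicas, x):
            """Run the replicas on x and return the majority result (or raise)."""
            results = [f(x) for f in replicas]
            value, votes = Counter(results).most_common(1)[0]
            if votes < 2:
                raise RuntimeError("no majority: more than one replica faulty")
            return value

        def correct(x):
            return x * x

        def faulty(x):            # models a replica hit by a fault scenario
            return x * x + 1

        print(tmr([correct, correct, faulty], 6))   # 36: the single fault is masked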

    Behavior Trees in Robotics and AI: An Introduction

    A Behavior Tree (BT) is a way to structure the switching between different tasks in an autonomous agent, such as a robot or a virtual entity in a computer game. BTs are a very efficient way of creating complex systems that are both modular and reactive. These properties are crucial in many applications, which has led to the spread of BTs from computer game programming to many branches of AI and Robotics. In this book, we will first give an introduction to BTs, then we describe how BTs relate to, and in many cases generalize, earlier switching structures. These ideas are then used as a foundation for a set of efficient and easy-to-use design principles. Properties such as safety, robustness, and efficiency are important for an autonomous system, and we describe a set of tools for formally analyzing these using a state space description of BTs. With the new analysis tools, we can formalize the descriptions of how BTs generalize earlier approaches. We also show the use of BTs in automated planning and machine learning. Finally, we describe an extended set of tools to capture the behavior of Stochastic BTs, where the outcomes of actions are described by probabilities. These tools enable the computation of both success probabilities and time to completion.
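
    The switching structure described above has well-known tick semantics: a Sequence node ticks its children left to right and stops at the first child that does not succeed, while a Fallback (Selector) node stops at the first child that does not fail. The classes below sketch those standard semantics in Python as a generic illustration; they are not code from the book, and the example tree and names are invented.

        # Minimal Behavior Tree sketch with the standard Sequence and Fallback
        # (Selector) semantics: each tick propagates down the tree and returns
        # SUCCESS, FAILURE or RUNNING. Generic illustration, not the book's code.

        SUCCESS, FAILURE, RUNNING = "SUCCESS", "FAILURE", "RUNNING"

        class Action:
            def __init__(self, name, fn):
                self.name, self.fn = name, fn
            def tick(self):
                return self.fn()

        class Sequence:
            """Succeeds only if every child succeeds, in order."""
            def __init__(self, children):
                self.children = children
            def tick(self):
                for child in self.children:
                    status = child.tick()
                    if status != SUCCESS:
                        return status           # FAILURE or RUNNING short-circuits
                return SUCCESS

        class Fallback:
            """Succeeds as soon as any child succeeds (tried in order)."""
            def __init__(self, children):
                self.children = children
            def tick(self):
                for child in self.children:
                    status = child.tick()
                    if status != FAILURE:
                        return status           # SUCCESS or RUNNING short-circuits
                return FAILURE

        # Example: recharge if the battery is low, otherwise do the task.
        battery_low = False
        tree = Fallback([
            Sequence([
                Action("battery_low?", lambda: SUCCESS if battery_low else FAILURE),
                Action("recharge",     lambda: RUNNING),
            ]),
            Action("do_task", lambda: SUCCESS),
        ])
        print(tree.tick())   # SUCCESS: battery is fine, so the task branch runs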

    A Taxonomy of Workflow Management Systems for Grid Computing

    With the advent of Grid and application technologies, scientists and engineers are building more and more complex applications to manage and process large data sets, and execute scientific experiments on distributed resources. Such application scenarios require means for composing and executing complex workflows. Therefore, many efforts have been made towards the development of workflow management systems for Grid computing. In this paper, we propose a taxonomy that characterizes and classifies various approaches for building and executing workflows on Grids. We also survey several representative Grid workflow systems developed by various projects world-wide to demonstrate the comprehensiveness of the taxonomy. The taxonomy not only highlights the design and engineering similarities and differences of state-of-the-art Grid workflow systems, but also identifies the areas that need further research.
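
    Grid workflows of the kind surveyed here are typically modelled as directed acyclic graphs of tasks with dependencies, and an engine runs each task once all of its predecessors have completed. The sketch below is a generic illustration of that execution model (here sequential, in topological order); it is not any of the surveyed systems, and the task names are invented.

        # Generic sketch of the workflow-execution model the survey classifies:
        # tasks form a directed acyclic graph and each task runs once all of its
        # dependencies have completed (here, sequentially in topological order).
        # Illustration only; no surveyed Grid workflow system is reproduced here.

        from graphlib import TopologicalSorter   # standard library, Python 3.9+

        def run_workflow(tasks, dependencies):
            """tasks: {name: callable}; dependencies: {name: set of prerequisite names}."""
            for name in TopologicalSorter(dependencies).static_order():
                print(f"running {name}")
                tasks[name]()

        run_workflow(
            tasks={
                "stage_in":  lambda: None,   # fetch input data to the compute site
                "simulate":  lambda: None,   # main computation
                "analyse":   lambda: None,   # post-processing
                "stage_out": lambda: None,   # publish results
            },
            dependencies={
                "simulate":  {"stage_in"},
                "analyse":   {"simulate"},
                "stage_out": {"analyse"},
            },
        )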