
    Measuring the BDARX architecture by agent oriented system a case study

    Distributed systems are increasingly designed as multi-agent systems, which help in building robust, complex industrial software. Cooperative distributed applications have recently become openly accessible, dynamic and large-scale. It hardly seems necessary today to emphasise the potential of decentralized software solutions, whose main benefit lies in the distributed nature of information, resources and action. On the other hand, progress in multi-agent systems creates new challenges for traditional fault-tolerance methodologies, which typically rely on centralized and offline solutions. Research on multi-agent systems has gained attention for designing software that operates in distributed and open environments, such as the Internet. DARX (Dynamic Agent Replication eXtension) is an architecture that aims at building reliable software that is both flexible and scalable, and at providing adaptive fault tolerance through dynamic replication. Its enhancement, known as BDARX, can provide a dynamic solution to Byzantine faults for agent-based systems that embed DARX. The BDARX architecture improves the fault tolerance of multi-agent systems in the long run and makes the software more robust against such arbitrary faults. BDARX provides Byzantine fault tolerance in DARX by creating replicas on both sides of the communicating agents, using a BFT protocol for agent systems, instead of creating replicas only at the server end and assuming the client to be failure-free. This paper shows that the dynamic behaviour of agents prevents us from discriminating between server and client replicas
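
The idea of replicating both communicating sides can be illustrated with a small voting sketch. This is not the BDARX protocol, which the abstract does not specify. It only assumes the usual BFT setting in which a sender is replicated and at most `f` of its replicas are Byzantine, so a receiver may trust a value once `f + 1` distinct sender replicas report it. The function name `accept_value` and the reply format are hypothetical.

```python
from collections import Counter


def accept_value(replies, f):
    """Return the value reported by at least f + 1 distinct replicas, else None.

    replies: dict mapping a replica id to the (hashable) value it reported.
    f: assumed upper bound on the number of Byzantine replicas.
    With at most f faulty replicas, f + 1 matching reports guarantee that
    at least one of them comes from a correct replica.
    """
    if not replies:
        return None
    # Keying the dict by replica id means each replica gets exactly one vote.
    counts = Counter(replies.values())
    value, votes = counts.most_common(1)[0]
    return value if votes >= f + 1 else None


# Four sender replicas, at most one of them faulty (f = 1).
replies = {"r0": "commit", "r1": "commit", "r2": "abort", "r3": "commit"}
decision = accept_value(replies, f=1)
```

Because the same check can run on the receiving replicas of either agent, neither side needs to be treated as failure-free, which is the point the abstract makes about not distinguishing server replicas from client replicas.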

    Attributes of fault-tolerant distributed file systems

    Fault tolerance in distributed file systems will be investigated by analyzing recovery techniques and concepts implemented within the following models of distributed systems: pool-processor model and user-server model. The research presented provides an overview of fault tolerance characteristics and mechanisms within current implementations and summarizes future directions for fault tolerant distributed file systems

    Self-Stabilization, Byzantine Containment, and Maximizable Metrics: Necessary Conditions

    Self-stabilization is a versatile approach to fault-tolerance since it permits a distributed system to recover from any transient fault that arbitrarily corrupts the contents of all memories in the system. Byzantine tolerance is an attractive feature of distributed systems that makes it possible to cope with arbitrary malicious behaviors. We consider the well-known problem of constructing a maximum metric tree in this context. Combining these two properties leads to some impossibility results. In this paper, we provide two necessary conditions for constructing a maximum metric tree in the presence of transient and (permanent) Byzantine faults

    A Novel Technique for Task Re-Allocation in Distributed Computing System

    A distributed computing system is a software system whose components are located on different networked computers and communicate and coordinate their actions by passing messages. A task executed on a distributed system must be reliable and feasible. Distributed systems such as grid networks, robotics and air traffic control systems depend heavily on time. If it is not detected accurately and recovered from in time, a single error in a real-time distributed system can cause a whole-system failure. Fault tolerance is the key method used to provide continuous reliability in these systems. Distributed computing systems face several challenges, such as resource sharing, transparency, dependability, complex mappings, concurrency and fault tolerance. In this paper, we focus on fault tolerance, since faults are responsible for the degradation of the system. A novel reliability-based technique is proposed to address the fault tolerance problem and re-allocate tasks. DOI: 10.17762/ijritcc2321-8169.15080

    Aspect-oriented fault tolerance for real-time embedded systems

    Real-time embedded systems for safety-critical applications have to introduce fault tolerance mechanisms in order to cope with hardware and software errors. Fault tolerance is usually applied by means of redundancy and diversity. Redundant hardware implies the establishment of a distributed system executing a set of fault tolerance strategies in software, and may also employ some form of diversity, by using different variants or versions for the same processing. This paper describes our approach to introducing fault tolerance in distributed embedded systems applications, using aspect-oriented programming (AOP). A real-time operating system supporting middleware thread communication was integrated into a fault-tolerant framework. The introduction of fault tolerance in the system is performed by AOP at the application thread level. The advantages of this approach include higher modularization, less effort for evolving legacy systems and better configurability for testing and product line development. This work has been tested and evaluated successfully in several fault-tolerant configurations and presented no significant performance or memory footprint costs. Fundação para a Ciência e a Tecnologia (FCT)

    Resource efficient redundancy using quorum-based cycle routing in optical networks

    In this paper we propose a cycle redundancy technique that provides optical networks with almost fault-tolerant point-to-point and multipoint-to-multipoint communications. More importantly, the technique is shown to approximately halve the necessary light-trail resources in the network while maintaining the fault-tolerance and dependability expected from cycle-based routing. For efficiency and distributed control, it is common in distributed systems and algorithms to group nodes into intersecting sets referred to as quorum sets. Optimal communication quorum sets forming optical cycles based on light-trails have been shown to flexibly and efficiently route both point-to-point and multipoint-to-multipoint traffic requests. Commonly, cycle routing techniques use pairs of cycles to achieve both routing and fault-tolerance, which uses substantial resources and creates the potential for underutilization. Instead, we intentionally utilize redundancy within the quorum cycles for fault-tolerance such that almost every point-to-point communication occurs in more than one cycle. The result is a set of cycles with 96.60% - 99.37% fault coverage, while using 42.9% - 47.18% fewer resources. Comment: 17th International Conference on Transparent Optical Networks (ICTON), 5-9 July 2015. arXiv admin note: substantial text overlap with arXiv:1608.05172, arXiv:1608.0516
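
The two properties this abstract relies on, quorum sets that pairwise intersect and node pairs that appear together in more than one quorum, can be checked with a short sketch. It is not the paper's construction or its fault-coverage metric. It assumes cyclic quorums built by shifting a base set modulo `n`, and the helper names and the 7-node example sets are chosen purely for illustration.

```python
from itertools import combinations


def cyclic_quorums(base, n):
    """Build n quorums by shifting the base set by 0..n-1 (mod n)."""
    return [frozenset((b + i) % n for b in base) for i in range(n)]


def pairwise_intersecting(quorums):
    """True if every two quorums share at least one node."""
    return all(a & b for a, b in combinations(quorums, 2))


def redundant_pair_fraction(quorums, nodes):
    """Fraction of node pairs that appear together in two or more quorums."""
    pairs = list(combinations(sorted(nodes), 2))
    redundant = 0
    for u, v in pairs:
        shared = sum(1 for q in quorums if u in q and v in q)
        if shared >= 2:
            redundant += 1
    return redundant / len(pairs)


n = 7
small = cyclic_quorums({0, 1, 3}, n)     # each node pair shares exactly one quorum
large = cyclic_quorums({0, 1, 2, 4}, n)  # each node pair shares exactly two quorums
```

In this toy case the smaller quorums still intersect pairwise but give no node pair a second shared quorum, while the larger ones give every pair two. That is the kind of built-in redundancy the abstract describes as an alternative to provisioning a separate pair of cycles.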

    The ISIS project: Fault-tolerance in large distributed systems

    The semi-annual status report covers activities of the ISIS project during the second half of 1989. The project had several independent objectives: (1) At the level of the ISIS Toolkit, ISIS release V2.0 was completed, containing bypass communication protocols. Performance of the system is greatly enhanced by this change, but the initial software release is limited in some respects. (2) The Meta project focused on the definition of the Lomita programming language for specifying rules that monitor sensors for conditions of interest and trigger appropriate reactions. This design was completed, and implementation of Lomita is underway on the Meta 2.0 platform. (3) The Deceit file system effort completed a prototype. It is planned to make Deceit available for use in two hospital information systems. (4) A long-haul communication subsystem project was completed and can be used as part of ISIS. This effort resulted in tools for linking ISIS systems on different LANs together over long-haul communications lines. (5) Magic Lantern, a graphical tool for building application monitoring and control interfaces, is included as part of the general ISIS releases