5,397 research outputs found

    An approach to rollback recovery of collaborating mobile agents

    Get PDF
    Fault-tolerance is one of the main problems that must be resolved to improve the adoption of the agents' computing paradigm. In this paper, we analyse the execution model of agent platforms and the significance of the faults affecting their constituent components on the reliable execution of agent-based applications, in order to develop a pragmatic framework for agent systems fault-tolerance. The developed framework deploys a communication-pairs independent check pointing strategy to offer a low-cost, application-transparent model for reliable agent- based computing that covers all possible faults that might invalidate reliable agent execution, migration and communication and maintains the exactly-one execution property

    Tractable reliable communication in compromised networks

    Get PDF
    Reliable communication is a fundamental primitive in distributed systems prone to Byzantine (i.e. arbitrary, and possibly malicious) failures to guarantee the integrity, delivery, and authorship of the messages exchanged between processes. Its practical adoption strongly depends on the system assumptions. Several solutions have been proposed so far in the literature implementing such a primitive, but some lack in scalability and/or demand topological network conditions computationally hard to be verified. This thesis aims to investigate and address some of the open problems and challenges implementing such a communication primitive. Specifically, we analyze how a reliable communication primitive can be implemented in 1) a static distributed system where a subset of processes is compromised, 2) a dynamic distributed system where part of the processes is Byzantine faulty, and 3) a static distributed system where every process can be compromised and recover. We define several more efficient protocols and we characterize alternative network conditions guaranteeing their correctness

    Fault-Tolerant Mobile Agent Execution

    Get PDF

    ATOMAS : a transaction-oriented open multi agent system; final report

    Get PDF
    The electronic marketplace of the future will consist of a large number of services located on an open, distributed and heterogeneous platform, which will be used by an even larger number of clients. Mobile Agent Systems are considered to be a precondition for the evolution of such an electronic market. They can provide a flexible infrastructure for this market, i.e. for the installation of new services by service agents as well as for the utilization of these services by client agents. Mobile Agent Systems basically consist of a number of locations and agents. Locations are (logical) abstractions for (physical) hosts in a computer network. The network of locations serves as a unique and homogeneous platform, while the underlying network of hosts may be heterogeneous and widely distributed. Locations therefore have to guarantee independence from the underlying hard- and software. To make the Mobile Agent System an open platform, the system furthermore has to guarantee security of hosts against malicious attacks

    Fault-tolerant and transactional mobile agent execution

    Get PDF
    Mobile agents constitute a computing paradigm of a more general nature than the widely used client/server computing paradigm. A mobile agent is essentially a computer program that acts autonomously on behalf of a user and travels through a network of heterogeneous machines. However, the greater flexibility of the mobile agent paradigm compared to the client/server computing paradigm comes at additional costs. These costs include, among others, the additional complexity of developing and managing mobile agent-based applications. This additional complexity comprises such issues as reliability. Before mobile agent technology can appear at the core of tomorrow's business applications, reliability mechanisms for mobile agents must be established. In this context, fault tolerance and transaction support are mechanisms of considerable importance. Various approaches to fault tolerance and transaction support exist. They have different strengths and weaknesses, and address different environments. Because of this variety, it is often difficult for the application programmer to choose the approach best suited to an application. This thesis introduces a classification of current approaches to fault-tolerant and transactional mobile agent execution. The classification, which focuses on algorithmic aspects, aims at structuring the field of fault-tolerant and transactional mobile agent execution and facilitates an understanding of the properties and weaknesses of particular approaches. In a distributed system, any software or hardware component may be subject to failures. A single failing component (e.g., agent or machine) may prevent the agent from proceeding with its execution. Worse yet, the current state of the agent and even its code may be lost. We say that the agent execution is blocked. For the agent owner, i.e., the person or application that has configured the agent, the agent does not return. To achieve fault-tolerance, the agent owner can try to detect the failure of the agent, and upon such an event launch a new agent. However, this requires the ability to correctly detect the crash of the agent, i.e., to distinguish between a failed agent and an agent that is delayed by slow processors or slow communication links. Unfortunately, this cannot be achieved in systems such as the Internet. An agent owner who tries to detect the failure of the agent thus cannot prevent the case in which the agent is mistakenly assumed to have crashed. In this case, launching a new agent leads to multiple executions of the agent, i.e., to the violation of the desired exactly-once property of agent execution. Although this may be acceptable for certain applications (e.g., applications whose operations do not have side-effects), others clearly forbid it. In this context, launching a new agent is a form of replication. In general, replication prevents blocking, but may lead to multiple executions of the agent, i.e., to a violation of the exactly-once execution property. This thesis presents an approach that ensures the exactly-once execution property using a simple principle: the mobile agent execution is modeled as a sequence of agreement problems. This model leads to an approach based on two well-known building blocks: consensus and reliable broadcast. We validate this approach with the implementation of FATOMAS, a Java-based FAult-TOlerant Mobile Agent System, and measure its overhead. Transactional mobile agents execute the mobile agent as a transaction. Assume, for instance, an agent whose task is to buy an airline ticket, book a hotel room, and rent a car at the flight destination. The agent owner naturally wants all three operations to succeed or none at all. Clearly, the rental car at the destination is of no use if no flight to the destination is available. On the other hand, the airline ticket may be useless if no rental car is available. The mobile agent's operations thus need to execute atomically, i.e., either all of them or none at all. Execution atomicity also needs to be ensured in the event of failures of hardware or software components. The approach presented in this thesis is non-blocking. A non-blocking transactional mobile agent execution has the important advantage that it can make progress despite failures. In a blocking transactional mobile agent execution, by contrast, progress is only possible when the failed component has recovered. Until then, the acquired locks generally cannot be freed. As no other transactional mobile agents can acquire the lock, overall system throughput is dramatically reduced. The present approach reuses the work on fault-tolerant mobile agent execution to prevent blocking. We have implemented the proposed approach and present the evaluation results

    Byzantine Gathering in Networks

    Full text link
    This paper investigates an open problem introduced in [14]. Two or more mobile agents start from different nodes of a network and have to accomplish the task of gathering which consists in getting all together at the same node at the same time. An adversary chooses the initial nodes of the agents and assigns a different positive integer (called label) to each of them. Initially, each agent knows its label but does not know the labels of the other agents or their positions relative to its own. Agents move in synchronous rounds and can communicate with each other only when located at the same node. Up to f of the agents are Byzantine. A Byzantine agent can choose an arbitrary port when it moves, can convey arbitrary information to other agents and can change its label in every round, in particular by forging the label of another agent or by creating a completely new one. What is the minimum number M of good agents that guarantees deterministic gathering of all of them, with termination? We provide exact answers to this open problem by considering the case when the agents initially know the size of the network and the case when they do not. In the former case, we prove M=f+1 while in the latter, we prove M=f+2. More precisely, for networks of known size, we design a deterministic algorithm gathering all good agents in any network provided that the number of good agents is at least f+1. For networks of unknown size, we also design a deterministic algorithm ensuring the gathering of all good agents in any network but provided that the number of good agents is at least f+2. Both of our algorithms are optimal in terms of required number of good agents, as each of them perfectly matches the respective lower bound on M shown in [14], which is of f+1 when the size of the network is known and of f+2 when it is unknown

    Global Versus Local Computations: Fast Computing with Identifiers

    Full text link
    This paper studies what can be computed by using probabilistic local interactions with agents with a very restricted power in polylogarithmic parallel time. It is known that if agents are only finite state (corresponding to the Population Protocol model by Angluin et al.), then only semilinear predicates over the global input can be computed. In fact, if the population starts with a unique leader, these predicates can even be computed in a polylogarithmic parallel time. If identifiers are added (corresponding to the Community Protocol model by Guerraoui and Ruppert), then more global predicates over the input multiset can be computed. Local predicates over the input sorted according to the identifiers can also be computed, as long as the identifiers are ordered. The time of some of those predicates might require exponential parallel time. In this paper, we consider what can be computed with Community Protocol in a polylogarithmic number of parallel interactions. We introduce the class CPPL corresponding to protocols that use O(nlog⁡kn)O(n\log^k n), for some k, expected interactions to compute their predicates, or equivalently a polylogarithmic number of parallel expected interactions. We provide some computable protocols, some boundaries of the class, using the fact that the population can compute its size. We also prove two impossibility results providing some arguments showing that local computations are no longer easy: the population does not have the time to compare a linear number of consecutive identifiers. The Linearly Local languages, such that the rational language (ab)∗(ab)^*, are not computable.Comment: Long version of SSS 2016 publication, appendixed version of SIROCCO 201
    • 

    corecore