An approach to rollback recovery of collaborating mobile agents
Fault tolerance is one of the main problems that must be resolved to improve the adoption of the agent computing paradigm. In this paper, we analyse the execution model of agent platforms and the significance of the faults affecting their constituent components on the reliable execution of agent-based applications, in order to develop a pragmatic framework for agent-system fault tolerance. The developed framework deploys a communication-pairs independent checkpointing strategy to offer a low-cost, application-transparent model for reliable agent-based computing that covers all possible faults that might invalidate reliable agent execution, migration and communication, and maintains the exactly-once execution property.
Fault Tolerant Adaptive Parallel and Distributed Simulation through Functional Replication
This paper presents FT-GAIA, a software-based fault-tolerant parallel and
distributed simulation middleware. FT-GAIA has been designed to reliably
handle Parallel And Distributed Simulation (PADS) models, which are needed to
properly simulate and analyze complex systems arising in any kind of scientific
or engineering field. PADS takes advantage of multiple execution units running in
multicore processors, clusters of workstations or HPC systems. However, large
computing systems, such as HPC systems that include hundreds of thousands of
computing nodes, have to handle frequent failures of some components. To cope
with this issue, FT-GAIA transparently replicates simulation entities and
distributes them on multiple execution nodes. This allows the simulation to
tolerate crash-failures of computing nodes. Moreover, FT-GAIA offers some
protection against Byzantine failures, since interaction messages among the
simulated entities are replicated as well, so that the receiving entity can
identify and discard corrupted messages. Results from an analytical model and
from an experimental evaluation show that FT-GAIA provides a high degree of
fault tolerance, at the cost of a moderate increase in the computational load
of the execution units. Comment: arXiv admin note: substantial text overlap with arXiv:1606.0731
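The message-filtering idea in this abstract (replicated interaction messages let the receiver identify and discard corrupted copies) can be illustrated with a simple vote. The following is a minimal sketch, not FT-GAIA's actual implementation: the function name `vote`, the parameter `byzantine_faults`, and the acceptance threshold of `byzantine_faults + 1` matching copies are assumptions made for illustration.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def vote(copies: Sequence[Hashable], byzantine_faults: int) -> Optional[Hashable]:
    """Pick the payload to deliver from the replicated copies of one message.

    Assumption (hypothetical, not taken from the paper): at most
    `byzantine_faults` of the copies may be corrupted, so a payload that
    appears at least `byzantine_faults + 1` times was sent by at least one
    correct replica. Returns None when no payload reaches that threshold.
    """
    if not copies:
        return None
    # Counter.most_common(1) yields the payload with the highest count.
    payload, count = Counter(copies).most_common(1)[0]
    return payload if count >= byzantine_faults + 1 else None


if __name__ == "__main__":
    # Three replicas of the sender, one of which delivers a corrupted copy.
    received = ["move(3,4)", "move(3,4)", "move(9,9)"]
    print(vote(received, byzantine_faults=1))
```

Under this assumption the corrupted copy is outvoted and dropped; if no payload gathers enough matching copies, the receiver delivers nothing rather than guess.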
On using the CAMA framework for developing open mobile fault tolerant agent systems
The paper introduces the CAMA (Context-Aware Mobile Agents) framework intended for developing large-scale mobile applications using the agent paradigm. CAMA provides a powerful set of abstractions, a supporting middleware and an adaptation layer allowing developers to address the main characteristics of mobile applications: openness, asynchronous and anonymous communication, fault tolerance, and device mobility. It ensures recursive system structuring using location, scope, agent and role abstractions. CAMA supports system fault tolerance through exception handling and structured agent coordination. The applicability of the framework is demonstrated using an ambient lecture scenario, the first part of ongoing work on a series of ambient campus applications.
On developing open mobile fault tolerant agent systems
The paper introduces the CAMA (Context-Aware Mobile Agents) framework intended for developing large-scale mobile applications using the agent paradigm. CAMA provides a powerful set of abstractions, a supporting middleware and an adaptation layer allowing developers to address the main characteristics of mobile applications: openness, asynchronous and anonymous communication, fault tolerance, and device mobility. It ensures recursive system structuring using location, scope, agent, and role abstractions. CAMA supports system fault tolerance through exception handling and structured agent coordination within nested scopes. The applicability of the framework is demonstrated using an ambient lecture scenario, the first part of ongoing work on a series of ambient campus applications. This scenario is developed starting from a thorough definition of the traceable requirements, including the fault tolerance requirements. This is followed by the design phase, in which the CAMA abstractions are applied. In the implementation phase, the CAMA middleware services are used through a provided API. This work is part of the FP6 IST RODIN project on Rigorous Open Development Environment for Complex Systems.
Reliable Fault Tolerance System for Service Composition in Mobile Ad Hoc Network
Due to the rapid development of smart mobile devices, mobile applications are exploring the use of web services in MANETs to satisfy user needs. Complex user needs are satisfied by service composition, where a complex service is created by combining one or more atomic services. Service composition is a significant challenge in MANETs because of limited bandwidth, constrained energy sources and dynamic node movement, and it often suffers from node failures. These constraints increase the failure rate of service composition. To overcome these problems, we propose a Reliable Fault Tolerant System for Service Composition in MANETs (RFTSC), which uses the checkpointing technique for service composition in MANETs. We propose a fault policy for each fault that can occur in service composition. Failed services in the service composition process are recovered locally by using the checkpointing system and discovered services that satisfy the QoS constraints. A Multi-Service Tree (MST) is proposed to recover failed services with O(1) time complexity. Simulation results show that the proposed approach is efficient when compared to existing approaches.
An Improved Approximate Consensus Algorithm in the Presence of Mobile Faults
This paper explores the problem of reaching approximate consensus in
synchronous point-to-point networks, where each pair of nodes is able to
communicate with each other directly and reliably. We consider the mobile
Byzantine fault model proposed by Garay '94 -- in the model, an omniscient
adversary can corrupt up to nodes in each round, and at the beginning of
each round, faults may "move" in the system (i.e., different sets of nodes may
become faulty in different rounds). Recent work by Bonomi et al. '16 proposed a
simple iterative approximate consensus algorithm which requires at least
nodes. This paper proposes a novel technique of using "confession" (a mechanism
to allow others to ignore past behavior) and a variant of reliable broadcast to
improve the fault-tolerance level. In particular, we present an approximate
consensus algorithm that requires only nodes, an
improvement over the state-of-the-art algorithms.
Moreover, we also show that the proposed algorithm is optimal within a family
of round-based algorithms.
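For readers unfamiliar with iterative approximate consensus, the general shape of one round can be sketched as follows. This is a generic trimmed-midpoint update, not the algorithm proposed in the paper (it omits the "confession" mechanism and the reliable-broadcast variant); the function name `trimmed_update` and the parameter `max_faults` are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def trimmed_update(received: Sequence[float], max_faults: int) -> float:
    """Compute a node's next state from the values received in one round.

    Generic sketch: sort the received values, drop the `max_faults` smallest
    and `max_faults` largest (any of which might come from faulty nodes),
    and move to the midpoint of what remains. `max_faults` is a hypothetical
    parameter standing for the per-round bound on corrupted nodes.
    """
    if max_faults < 0:
        raise ValueError("max_faults must be non-negative")
    if len(received) <= 2 * max_faults:
        raise ValueError("need more than 2 * max_faults values to trim both ends")
    ordered = sorted(received)
    kept = ordered[max_faults:len(ordered) - max_faults]
    # The midpoint of the surviving extremes lies within the range of the
    # values that were not trimmed.
    return (kept[0] + kept[-1]) / 2.0


if __name__ == "__main__":
    # Four plausible values and one outlier that a faulty node might send.
    print(trimmed_update([0.0, 1.0, 2.0, 3.0, 100.0], max_faults=1))
```

Repeating such an update round after round is what lets correct nodes' values draw closer together; how many nodes are needed for this to work under mobile faults is the question the abstract addresses.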