Prototype of Fault Adaptive Embedded Software for Large-Scale Real-Time Systems
This paper describes a comprehensive prototype of large-scale fault adaptive
embedded software developed for the proposed Fermilab BTeV high energy physics
experiment. Lightweight self-optimizing agents embedded within Level 1 of the
prototype are responsible for proactive and reactive monitoring and mitigation
based on specified layers of competence. The agents are self-protecting,
detecting cascading failures using a distributed approach. Adaptive,
reconfigurable, and mobile objects for reliability are designed to be
self-configuring to adapt automatically to dynamically changing environments.
These objects provide a self-healing layer with the ability to discover,
diagnose, and react to discontinuities in real-time processing. A generic
modeling environment was developed to facilitate design and implementation of
hardware resource specifications, application data flow, and failure mitigation
strategies. Level 1 of the planned BTeV trigger system alone will consist of
2500 DSPs, so the number of components and intractable fault scenarios involved
make it impossible to design an `expert system' that applies traditional
centralized mitigative strategies based on rules capturing every possible
system state. Instead, a distributed reactive approach is implemented using the
tools and methodologies developed by the Real-Time Embedded Systems group.
Comment: 2nd Workshop on Engineering of Autonomic Systems (EASe), in the 12th
Annual IEEE International Conference and Workshop on the Engineering of
Computer Based Systems (ECBS), Washington, DC, April, 200
States in Process Calculi
Formal reasoning about distributed algorithms (like Consensus) typically
requires analyzing global states in a traditional state-based style. This is
in contrast to the traditional action-based reasoning of process calculi.
Nevertheless, we use domain-specific variants of the latter, as they are
convenient modeling languages in which the local code of processes can be
programmed explicitly, with the local state information usually managed via
parameter lists of process constants. However, domain-specific process calculi
are often equipped with (unlabeled) reduction semantics, building upon a rich
and convenient notion of structural congruence. Unfortunately, the price for
this convenience is that the analysis is cumbersome: the set of reachable
states is defined only modulo structural congruence, and the processes' state information is
very hard to identify. We extract from congruence classes of reachable states
individual state-informative representatives that we supply with a proper
formal semantics. As a result, we can now freely switch between the process
calculus terms and their representatives, and we can use the stateful
representatives to perform assertional reasoning on process calculus models.
Comment: In Proceedings EXPRESS/SOS 2014, arXiv:1408.127
Rendezvous in Networks in Spite of Delay Faults
Two mobile agents, starting from different nodes of an unknown network, have
to meet at the same node. Agents move in synchronous rounds using a
deterministic algorithm. Each agent has a different label, which it can use in
the execution of the algorithm, but it does not know the label of the other
agent. Agents do not know any bound on the size of the network. In each round
an agent decides if it remains idle or if it wants to move to one of the
adjacent nodes. Agents are subject to delay faults: if an agent incurs a fault
in a given round, it remains in the current node, regardless of its decision.
If it planned to move and the fault happened, the agent is aware of it. We
consider three scenarios of fault distribution: random (independently in each
round and for each agent with constant probability 0 < p < 1), unbounded
adversarial (the adversary can delay an agent for an arbitrary finite number of
consecutive rounds) and bounded adversarial (the adversary can delay an agent
for at most c consecutive rounds, where c is unknown to the agents). The
quality measure of a rendezvous algorithm is its cost, which is the total
number of edge traversals. For random faults, we show an algorithm with cost
polynomial in the size n of the network and polylogarithmic in the larger label
L, which achieves rendezvous with very high probability in arbitrary networks.
By contrast, for unbounded adversarial faults we show that rendezvous is not
feasible, even in the class of rings. Under this scenario we give a rendezvous
algorithm with cost O(nl), where l is the smaller label, working in arbitrary
trees, and we show that Ω(l) is the lower bound on rendezvous cost, even
for the two-node tree. For bounded adversarial faults, we give a rendezvous
algorithm working for arbitrary networks, with cost polynomial in n, and
logarithmic in the bound c and in the larger label L.
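
The delay-fault model described in this abstract can be illustrated with a
minimal sketch of one synchronous round under the random fault scenario. This
is not the paper's rendezvous algorithm; it only models how a fault overrides
an agent's decision and when the agent notices it. The function name
run_round, the dictionary layout, and the returned edge-traversal count are
assumptions made for illustration.

```python
import random


def run_round(positions, decisions, p, rng):
    """Apply one synchronous round under random delay faults.

    positions: dict mapping agent name -> current node
    decisions: dict mapping agent name -> adjacent target node,
               or None if the agent chooses to stay idle
    p:         probability that an agent is delayed in this round
               (drawn independently for each agent)
    rng:       a random.Random instance

    Returns (new_positions, noticed, traversals), where noticed[agent]
    is True only if the agent planned to move and was delayed, and
    traversals is the number of edges actually crossed this round
    (the quantity that the cost measure sums over all rounds).
    """
    new_positions = {}
    noticed = {}
    traversals = 0
    for agent, node in positions.items():
        faulty = rng.random() < p
        target = decisions[agent]
        if target is None or faulty:
            # Idle by choice, or delayed: the agent stays where it is.
            new_positions[agent] = node
        else:
            new_positions[agent] = target
            traversals += 1
        # A fault is observable only when it blocked a planned move.
        noticed[agent] = faulty and target is not None
    return new_positions, noticed, traversals
```

For example, on the two-node tree with nodes 0 and 1, calling
run_round({"a": 0, "b": 1}, {"a": 1, "b": None}, 0.0, random.Random(0))
moves agent a to node 1, where it meets the idle agent b, at a cost of one
edge traversal. With p = 1.0 the same call leaves both agents in place, and
only agent a is aware of the fault.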