
    Prototype of Fault Adaptive Embedded Software for Large-Scale Real-Time Systems

    This paper describes a comprehensive prototype of large-scale fault adaptive embedded software developed for the proposed Fermilab BTeV high energy physics experiment. Lightweight self-optimizing agents embedded within Level 1 of the prototype are responsible for proactive and reactive monitoring and mitigation based on specified layers of competence. The agents are self-protecting, detecting cascading failures using a distributed approach. Adaptive, reconfigurable, and mobile objects for reliability are designed to be self-configuring so that they adapt automatically to dynamically changing environments. These objects provide a self-healing layer with the ability to discover, diagnose, and react to discontinuities in real-time processing. A generic modeling environment was developed to facilitate design and implementation of hardware resource specifications, application data flow, and failure mitigation strategies. Level 1 of the planned BTeV trigger system alone will consist of 2500 DSPs, so the number of components and intractable fault scenarios involved makes it impossible to design an "expert system" that applies traditional centralized mitigative strategies based on rules capturing every possible system state. Instead, a distributed reactive approach is implemented using the tools and methodologies developed by the Real-Time Embedded Systems group. Comment: 2nd Workshop on Engineering of Autonomic Systems (EASe), in the 12th Annual IEEE International Conference and Workshop on the Engineering of Computer Based Systems (ECBS), Washington, DC, April, 200

    States in Process Calculi

    Formal reasoning about distributed algorithms (like Consensus) typically requires analyzing global states in a traditional state-based style. This is in contrast to the traditional action-based reasoning of process calculi. Nevertheless, we use domain-specific variants of the latter, as they are convenient modeling languages in which the local code of processes can be programmed explicitly, with the local state information usually managed via parameter lists of process constants. However, domain-specific process calculi are often equipped with (unlabeled) reduction semantics, building upon a rich and convenient notion of structural congruence. Unfortunately, the price for this convenience is that the analysis is cumbersome: the set of reachable states is taken modulo structural congruence, and the processes' state information is very hard to identify. We extract from congruence classes of reachable states individual state-informative representatives that we supply with a proper formal semantics. As a result, we can now freely switch between the process calculus terms and their representatives, and we can use the stateful representatives to perform assertional reasoning on process calculus models. Comment: In Proceedings EXPRESS/SOS 2014, arXiv:1408.127

    Rendezvous in Networks in Spite of Delay Faults

    Two mobile agents, starting from different nodes of an unknown network, have to meet at the same node. Agents move in synchronous rounds using a deterministic algorithm. Each agent has a different label, which it can use in the execution of the algorithm, but it does not know the label of the other agent. Agents do not know any bound on the size of the network. In each round an agent decides whether to remain idle or to move to one of the adjacent nodes. Agents are subject to delay faults: if an agent incurs a fault in a given round, it remains in the current node, regardless of its decision. If it planned to move and the fault happened, the agent is aware of it. We consider three scenarios of fault distribution: random (independently in each round and for each agent with constant probability 0 < p < 1), unbounded adversarial (the adversary can delay an agent for an arbitrary finite number of consecutive rounds) and bounded adversarial (the adversary can delay an agent for at most c consecutive rounds, where c is unknown to the agents). The quality measure of a rendezvous algorithm is its cost, which is the total number of edge traversals. For random faults, we show an algorithm with cost polynomial in the size n of the network and polylogarithmic in the larger label L, which achieves rendezvous with very high probability in arbitrary networks. By contrast, for unbounded adversarial faults we show that rendezvous is not feasible, even in the class of rings. Under this scenario we give a rendezvous algorithm with cost O(nl), where l is the smaller label, working in arbitrary trees, and we show that Ω(l) is the lower bound on rendezvous cost, even for the two-node tree. For bounded adversarial faults, we give a rendezvous algorithm working for arbitrary networks, with cost polynomial in n, and logarithmic in the bound c and in the larger label L.
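
The delay-fault model in the abstract above can be illustrated with a minimal sketch. It is not the paper's rendezvous algorithm: it only simulates one agent under the random-fault scenario, and it assumes a ring network for simplicity. The names `run_round` and `simulate`, and the fixed clockwise move direction, are hypothetical choices made for this illustration.

```python
import random


def run_round(position, wants_to_move, faulted, ring_size):
    """Advance one agent by one synchronous round on a ring (assumed topology).

    Returns (new_position, aware_of_fault, edge_traversals).
    """
    if faulted:
        # A delay fault keeps the agent at its node regardless of its decision.
        # The agent is aware of the fault only if it had planned to move.
        return position, wants_to_move, 0
    if wants_to_move:
        # Hypothetical convention: a move is one step clockwise.
        return (position + 1) % ring_size, False, 1
    return position, False, 0


def simulate(decisions, p, ring_size, start=0, seed=0):
    """Apply a sequence of move/idle decisions under random delay faults.

    Each round faults independently with probability p. The returned cost
    counts edge traversals, the quality measure named in the abstract.
    """
    rng = random.Random(seed)
    position, cost = start, 0
    for wants_to_move in decisions:
        faulted = rng.random() < p
        position, _, moved = run_round(position, wants_to_move, faulted, ring_size)
        cost += moved
    return position, cost
```

With `p` close to 0 the agent completes almost every planned move, while with `p` close to 1 it is almost always held at its starting node; the adversarial scenarios would replace the random draw with a schedule chosen by an adversary.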