
    A Superstabilizing $\log(n)$-Approximation Algorithm for Dynamic Steiner Trees

    In this paper we design and prove correct a fully dynamic distributed algorithm for maintaining an approximate Steiner tree that connects a subset of the nodes of a network (referred to as Steiner members or the Steiner group) via a minimum-weight spanning tree. Steiner trees are good candidates for efficiently implementing communication primitives such as publish/subscribe or multicast, essential building blocks for emerging networks (e.g., P2P, sensor or ad hoc networks). The cost of the solution returned by our algorithm is at most $\log |S|$ times the cost of an optimal solution, where $S$ is the group of members. Our algorithm improves over existing solutions in several ways. First, it tolerates the dynamism of both the group members and the network. Next, our algorithm is self-stabilizing, that is, it copes with the corruption of node memory. Last but not least, our algorithm is superstabilizing. That is, while converging to a correct configuration (i.e., a Steiner tree) after a modification of the network, it keeps offering the Steiner tree service during the stabilization time to all members that have not been affected by this modification.
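To make the notion of a Steiner tree over a member group concrete, here is a minimal centralized sketch in Python. It uses a simple greedy rule, swapped in purely for illustration: each member is attached to the tree built so far along a shortest path. It is not the distributed, self-stabilizing algorithm of the paper, and no approximation guarantee is claimed for it. The names `greedy_steiner` and `nearest_tree_path`, and the dict-of-dicts graph representation, are assumptions of this sketch.

```python
import heapq


def nearest_tree_path(graph, source, tree_nodes):
    """Run Dijkstra from `source` and stop at the first settled tree node.

    Returns (cost, path), where path goes from `source` to that tree node.
    """
    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    settled = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in settled:
            continue
        settled.add(u)
        if u in tree_nodes:
            # Walk the predecessor chain back to the source.
            path = [u]
            while path[-1] != source:
                path.append(prev[path[-1]])
            return d, path[::-1]
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    raise ValueError("member is disconnected from the current tree")


def greedy_steiner(graph, members):
    """Attach each member, in order, to the current tree via a shortest path."""
    members = list(members)
    tree_nodes = {members[0]}
    tree_edges = set()
    cost = 0
    for m in members[1:]:
        d, path = nearest_tree_path(graph, m, tree_nodes)
        for a, b in zip(path, path[1:]):
            tree_edges.add(frozenset((a, b)))
        tree_nodes.update(path)
        cost += d
    return cost, tree_edges


# Hypothetical undirected weighted network: a star around "c" plus a costly a-b link.
example_graph = {
    "a": {"c": 1, "b": 3},
    "b": {"c": 1, "a": 3},
    "c": {"a": 1, "b": 1, "d": 1},
    "d": {"c": 1},
}
example_cost, example_edges = greedy_steiner(example_graph, ["a", "b", "d"])
```

In this example the non-member node `c` ends up in the tree because routing through it is cheaper than the direct `a`-`b` link, which is exactly what distinguishes a Steiner tree from a spanning tree restricted to the members.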

    Anonymous Asynchronous Systems: The Case of Failure Detectors

    Due to the multiplicity of loci of control, a main issue distributed systems have to cope with lies in the uncertainty about the system state created by adversaries such as asynchrony, failures, dynamicity, mobility, etc. Considering message-passing systems, this paper considers the uncertainty created by the net effect of three of these adversaries, namely, asynchrony, failures, and anonymity. This means that, in addition to being asynchronous and crash-prone, the processes have no identity. Trivially, agreement problems (e.g., consensus) that cannot be solved in the presence of asynchrony and failures cannot be solved either when adding anonymity. The paper consequently proposes anonymous failure detectors to circumvent these impossibilities. It has several contributions. First, it presents three classes of failure detectors (denoted AP, AΩ and AΣ) and shows that they are the anonymous counterparts of the classes of perfect failure detectors, eventual leader failure detectors and quorum failure detectors, respectively. The class AΣ is new, and showing that it is the anonymous counterpart of the class Σ is not trivial. Then, the paper presents and proves correct a genuinely anonymous consensus algorithm based on the pair of anonymous failure detector classes (AΩ, AΣ) (“genuinely” means that not only do processes have no identity, but no process is aware of the total number of processes). This new algorithm is not a “straightforward extension” of an algorithm designed for non-anonymous systems. To benefit from AΣ, it uses a novel message exchange pattern where each phase of every round is made up of sub-rounds in which appropriate control information is exchanged. Finally, the paper discusses the notions of failure detector class hierarchy and weakest failure detector class for a given problem in the context of anonymous systems.

    Charge distribution in two-dimensional electrostatics

    We examine the stability of ringlike configurations of $N$ charges on a plane interacting through the potential $V(z_1,\dots,z_N)=\sum_i |z_i|^2-\sum_{i<j} \ln|z_i-z_j|^2$. We interpret the equilibrium distributions in terms of a shell model and compare predictions of the model with the results of numerical simulations for systems with up to 100 particles.
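As a small illustration of the potential quoted above, the following Python sketch evaluates $V$ for charges at complex positions $z_i$ and builds a single-ring configuration. The helper names (`potential`, `ring`, `single_ring_radius`) are assumptions of this sketch, not taken from the paper. The radius formula is a short side calculation rather than a result quoted from the abstract: for $N$ equally spaced charges on a circle of radius $R$, $V(R) = N R^2 - \tfrac{N(N-1)}{2}\ln R^2 + \text{const}$, which is minimized at $R^2 = (N-1)/2$.

```python
import cmath
import math


def potential(zs):
    """V = sum_i |z_i|^2 - sum_{i<j} ln |z_i - z_j|^2 for complex positions zs."""
    confinement = sum(abs(z) ** 2 for z in zs)
    repulsion = sum(
        math.log(abs(zs[i] - zs[j]) ** 2)
        for i in range(len(zs))
        for j in range(i + 1, len(zs))
    )
    return confinement - repulsion


def ring(n, radius, center=False):
    """n charges equally spaced on a circle, optionally with one extra charge at the origin."""
    points = [radius * cmath.exp(2j * math.pi * k / n) for k in range(n)]
    return points + [0j] if center else points


def single_ring_radius(n):
    """Radius minimizing V when all n charges sit on one ring (side calculation above)."""
    return math.sqrt((n - 1) / 2)


# Energy of a six-charge ring at its best radius and at slightly perturbed radii.
r0 = single_ring_radius(6)
v_best = potential(ring(6, r0))
v_wider = potential(ring(6, 1.05 * r0))
v_tighter = potential(ring(6, 0.95 * r0))
```

Comparing `potential(ring(n, r))` with `potential(ring(n - 1, r, center=True))` for suitable radii is one way to explore numerically when a single ring stops being the preferred arrangement, which is the kind of question the shell model addresses.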

    Solving atomic multicast when groups crash

    In this paper, we study the atomic multicast problem, a fundamental abstraction for building fault-tolerant systems. In the atomic multicast problem, the system is divided into non-empty and disjoint groups of processes. Multicast messages may be addressed to any subset of groups, each message possibly being multicast to a different subset. Several papers previously studied this problem either in local area networks [3, 9, 20] or wide area networks [13, 21]. However, none of them considered atomic multicast when groups may crash. We present two atomic multicast algorithms that tolerate the crash of groups. The first algorithm tolerates an arbitrary number of failures, is genuine (i.e., to deliver a message m, only addressees of m are involved in the protocol), and uses the perfect failure detector P. We show that among realistic failure detectors, i.e., those that do not predict the future, P is necessary to solve genuine atomic multicast if we do not bound the number of processes that may fail. Thus, P is the weakest realistic failure detector for solving genuine atomic multicast when an arbitrary number of processes may crash. Our second algorithm is non-genuine and less resilient to process failures than the first algorithm but has several advantages: (i) it requires perfect failure detection within groups only, and not across the system; (ii) as we show in the paper, it can be modified to rely on unreliable failure detection at the cost of a weaker liveness guarantee; and (iii) it is fast: messages addressed to multiple groups may be delivered within only two inter-group message delays.

    Experience with the LHC beam dump post-operational checks system

    After each beam dump in the LHC, automatic post-operational checks are made to guarantee that the last beam dump has been executed correctly and that the system can be declared to be ‘as good as new’ before the next injection is allowed. The analysis scope comprises the kicker waveforms, redundancy in kicker generator signal paths and different beam instrumentation measurements. This paper describes the implementation and the operational experience of the internal and external post-operational checks of the LHC beam dumping system during the commissioning of the LHC without beam and during the first days of beam operation.

    Classification in sparse, high dimensional environments applied to distributed systems failure prediction

    Network failures are still one of the main causes of distributed systems’ lack of reliability. To overcome this problem we present an improvement over a failure prediction system, based on Elastic Net Logistic Regression and the application of rare events prediction techniques, able to work with sparse, high dimensional datasets. Specifically, we prove its stability, fine-tune its hyperparameter and improve its industrial utility by showing that, with a slight change in dataset creation, it can also predict the location of a failure, a key asset when trying to take a proactive approach to failure management.