74 research outputs found
Preserving Stabilization while Practically Bounding State Space
Stabilization is a key dependability property for dealing with unanticipated
transient faults, as it guarantees that even in the presence of such faults,
the system will recover to states where it satisfies its specification. One of
the desirable attributes of stabilization is the use of bounded space for each
variable. In this paper, we present an algorithm that transforms a stabilizing
program that uses variables with unbounded domain into a stabilizing program
that uses bounded variables and (practically bounded) physical time. While
non-stabilizing programs (that do not handle transient faults) can deal with
unbounded variables by assigning large enough but bounded space, stabilizing
programs that need to deal with arbitrary transient faults cannot do the same
since a transient fault may corrupt the variable to its maximum value. We show
that our transformation algorithm is applicable to several problems including
logical clocks, vector clocks, mutual exclusion, leader election, diffusing
computations, Paxos based consensus, and so on. Moreover, our approach can also
be used to bound counters used in an earlier work by Katz and Perry for adding
stabilization to a non-stabilizing program. By combining our algorithm with
that earlier work by Katz and Perry, it would be possible to provide
stabilization for a rich class of problems, by assigning large enough but
bounded space for variables.Comment: Moved some content from the Appendix to the main paper, added some
details to the transformation algorithm and to its descriptio
Auditable Restoration of Distributed Programs
We focus on a protocol for auditable restoration of distributed systems. The
need for such protocol arises due to conflicting requirements (e.g., access to
the system should be restricted but emergency access should be provided). One
can design such systems with a tamper detection approach (based on the
intuition of "break the glass door"). However, in a distributed system, such
tampering, which are denoted as auditable events, is visible only for a single
node. This is unacceptable since the actions they take in these situations can
be different than those in the normal mode. Moreover, eventually, the auditable
event needs to be cleared so that system resumes the normal operation.
With this motivation, in this paper, we present a protocol for auditable
restoration, where any process can potentially identify an auditable event.
Whenever a new auditable event occurs, the system must reach an "auditable
state" where every process is aware of the auditable event. Only after the
system reaches an auditable state, it can begin the operation of restoration.
Although any process can observe an auditable event, we require that only
"authorized" processes can begin the task of restoration. Moreover, these
processes can begin the restoration only when the system is in an auditable
state. Our protocol is self-stabilizing and has bounded state space. It can
effectively handle the case where faults or auditable events occur during the
restoration protocol. Moreover, it can be used to provide auditable restoration
to other distributed protocol.Comment: 10 page
Non-equilibrium steady states of stochastic processes with intermittent resetting
Stochastic processes that are randomly reset to an initial condition serve as
a showcase to investigate non-equilibrium steady states. However, all existing
results have been restricted to the special case of memoryless resetting
protocols. Here, we obtain the general solution for the distribution of
processes in which waiting times between reset events are drawn from an
arbitrary distribution. This allows for the investigation of a broader class of
much more realistic processes. As an example, our results are applied to the
analysis of the efficiency of constrained random search processes.Comment: 5 pages, 4 figure
Separation of Circulating Tokens
Self-stabilizing distributed control is often modeled by token abstractions.
A system with a single token may implement mutual exclusion; a system with
multiple tokens may ensure that immediate neighbors do not simultaneously enjoy
a privilege. For a cyber-physical system, tokens may represent physical objects
whose movement is controlled. The problem studied in this paper is to ensure
that a synchronous system with m circulating tokens has at least d distance
between tokens. This problem is first considered in a ring where d is given
whilst m and the ring size n are unknown. The protocol solving this problem can
be uniform, with all processes running the same program, or it can be
non-uniform, with some processes acting only as token relays. The protocol for
this first problem is simple, and can be expressed with Petri net formalism. A
second problem is to maximize d when m is given, and n is unknown. For the
second problem, the paper presents a non-uniform protocol with a single
corrective process.Comment: 22 pages, 7 figures, epsf and pstricks in LaTe
Brief announcement: incremental component-based modeling, verification, and performance evaluation of distributed reset
In this work, we apply a methodology which consistently integrates modeling, verification, and performance evaluation techniques, based on the BIP (Behavior, Interaction, Priority) component framework developed at Verimag [A. Basu et al., 2006; A. Basu et al., 2008]. BIP is based on a semantic model encompassing composition of heterogeneous components. The distributed semantics of BIP allows generating from a high-level component-based model in BIP an observationally equivalent distributed implementation [A. Basu et al., 2008]. BIP uses two families of composition operators for expressing coordination between components: interactions and priorities. Interactions may involve multiple components (unlike traditional point-to-point formalisms) and are expressed by combining two protocols: rendezvous and broadcast. We note that addition of interactions among components adds no extra behaviors
An Adaptive Policy Management Approach to BGP Convergence
The Border Gateway Protocol (BGP) is the current inter-domain routing protocol used to exchange reachability information between Autonomous Systems (ASes) in the Internet. BGP supports policy-based routing which allows each AS to independently adopt a set of local policies that specify which routes it accepts and advertises from/to other networks, as well as which route it prefers when more than one route becomes available. However, independently chosen local policies may cause global conflicts, which result in protocol divergence. In this paper, we propose a new algorithm, called Adaptive Policy Management Scheme (APMS), to resolve policy conflicts in a distributed manner. Akin to distributed feedback control systems, each AS independently classifies the state of the network as either conflict-free or potentially-conflicting by observing its local history only (namely, route flaps). Based on the degree of measured conflicts (policy conflict-avoidance vs. -control mode), each AS dynamically adjusts its own path preferences—increasing its preference for observably stable paths over flapping paths. APMS also includes a mechanism to distinguish route flaps due to topology changes, so as not to confuse them with those due to policy conflicts. A correctness and convergence analysis of APMS based on the substability property of chosen paths is presented. Implementation in the SSF network simulator is performed, and simulation results for different performance metrics are presented. The metrics capture the dynamic performance (in terms of instantaneous throughput, delay, routing load, etc.) of APMS and other competing solutions, thus exposing the often neglected aspects of performance.National Science Foundation (ANI-0095988, EIA-0202067, ITR ANI-0205294
An automated wrapper-based approach to the design of dependable software
The design of dependable software systems invariably comprises two main activities: (i) the design of dependability mechanisms, and (ii) the location of dependability mechanisms. It has been shown that these activities are intrinsically difficult. In this paper we propose an automated wrapper-based methodology to circumvent the problems associated with the design and location of dependability mechanisms. To achieve this we replicate important variables so that they can be used as part of standard, efficient dependability mechanisms. These well-understood mechanisms are then deployed in all relevant locations. To validate the proposed methodology we apply it to three complex software systems, evaluating the dependability enhancement and execution overhead in each case. The results generated demonstrate that the system failure rate of a wrapped software system can be several orders of magnitude lower than that of an unwrapped equivalent
First-passage time of a Brownian searcher with stochastic resetting to random positions
We study the effect of a resetting point randomly distributed around the
origin on the mean first passage time of a Brownian searcher moving in one
dimension. We compare the search efficiency with that corresponding to reset to
the origin and find that the mean first passage time of the latter can be
larger or smaller than the distributed case, depending on whether the resetting
points are symmetrically or asymmetrically distributed. In particular, we prove
the existence of an optimal reset rate that minimizes the mean first-passage
time for distributed resetting to a finite interval if the target is located
outside this interval. When the target position belongs to the resetting
interval or it is infinite then no optimal reset rate exists, but there is an
optimal resetting interval width or resetting characteristic scale which
minimizes the mean first-passage time. We also show that the first-passage
density averaged over the resetting points depends on its first moment only. As
a consequence, there is an equivalent point such that the first-passage problem
with resetting to that point is statistically equivalent to the case of
distributed resetting. We end our study by analyzing the fluctuations of the
first-passage times for these cases. All our analytical results are verified
through numerical simulations.Comment: 13 pages, 15 figure
Self-stabilizing tree algorithms
Designers of distributed algorithms have to contend with the problem of making the algorithms tolerant to several forms of coordination loss, primarily faulty initialization. The processes in a distributed system do not share a global memory and can only get a partial view of the global state. Transient failures in one part of the system may go unnoticed in other parts and thus cause the system to go into an illegal state. If the system were self-stabilizing, however, it is guaranteed that it will return to a legal state after a finite number of state transitions. This thesis presents and proves self-stabilizing algorithms for calculating tree metrics and for achieving mutual exclusion on a tree structured distributed system
- …