Self-Stabilization, Byzantine Containment, and Maximizable Metrics: Necessary Conditions
Self-stabilization is a versatile approach to fault-tolerance since it
permits a distributed system to recover from any transient fault that
arbitrarily corrupts the contents of all memories in the system. Byzantine
tolerance is an attractive feature of distributed systems that makes it possible to cope
with arbitrary malicious behaviors. We consider the well-known problem of
constructing a maximum metric tree in this context. Combining these two
properties leads to some impossibility results. In this paper, we provide two
necessary conditions to construct a maximum metric tree in the presence of transient
and (permanent) Byzantine faults.
Bounding the Impact of Unbounded Attacks in Stabilization
Self-stabilization is a versatile approach to fault-tolerance since it
permits a distributed system to recover from any transient fault that
arbitrarily corrupts the contents of all memories in the system. Byzantine
tolerance is an attractive feature of distributed systems that makes it possible to cope
with arbitrary malicious behaviors. Combining these two properties proved
difficult: it is impossible to contain the spatial impact of Byzantine nodes in
a self-stabilizing context for global tasks such as tree orientation and tree
construction. We present and illustrate a new concept of Byzantine containment
in stabilization. Our property, called strong stabilization, makes it possible to contain
the impact of Byzantine nodes if they actually perform too many Byzantine
actions. We derive impossibility results for strong stabilization and present
strongly stabilizing protocols for tree orientation and tree construction that
are optimal with respect to the number of Byzantine nodes that can be tolerated
in a self-stabilizing context.
Auditable Restoration of Distributed Programs
We focus on a protocol for auditable restoration of distributed systems. The
need for such a protocol arises due to conflicting requirements (e.g., access to
the system should be restricted but emergency access should be provided). One
can design such systems with a tamper detection approach (based on the
intuition of "break the glass door"). However, in a distributed system, such
tampering, denoted as an auditable event, is visible only to a single
node. This is unacceptable since the actions nodes take in these situations can
be different from those in the normal mode. Moreover, eventually, the auditable
event needs to be cleared so that the system resumes normal operation.
With this motivation, in this paper, we present a protocol for auditable
restoration, where any process can potentially identify an auditable event.
Whenever a new auditable event occurs, the system must reach an "auditable
state" where every process is aware of the auditable event. Only after the
system reaches an auditable state can it begin the operation of restoration.
Although any process can observe an auditable event, we require that only
"authorized" processes can begin the task of restoration. Moreover, these
processes can begin the restoration only when the system is in an auditable
state. Our protocol is self-stabilizing and has bounded state space. It can
effectively handle the case where faults or auditable events occur during the
restoration protocol. Moreover, it can be used to provide auditable restoration
to other distributed protocols. Comment: 10 page
Reliable Communication in a Dynamic Network in the Presence of Byzantine Faults
We consider the following problem: two nodes want to reliably communicate in
a dynamic multihop network where some nodes have been compromised, and may have
a totally arbitrary and unpredictable behavior. These nodes are called
Byzantine. We consider the two cases where cryptography is available and not
available. We prove the necessary and sufficient condition (that is, the
weakest possible condition) to ensure reliable communication in this context.
Our proof is constructive, as we provide Byzantine-resilient algorithms for
reliable communication that are optimal with respect to our impossibility
results. In a second part, we investigate the impact of our conditions in three
case studies: participants interacting in a conference, robots moving on a grid
and agents in the subway. Our simulations indicate a clear benefit of using our
algorithms for reliable communication in those contexts.
A Scalable Byzantine Grid
Modern networks assemble an ever growing number of nodes. However, it remains
difficult to increase the number of channels per node, thus the maximal degree
of the network may be bounded. This is typically the case in grid topology
networks, where each node has at most four neighbors. In this paper, we address
the following issue: if each node is likely to fail in an unpredictable manner,
how can we preserve some global reliability guarantees when the number of nodes
keeps increasing unboundedly? To be more specific, we consider the problem of
reliably broadcasting information on an asynchronous grid in the presence of
Byzantine failures -- that is, some nodes may have an arbitrary and potentially
malicious behavior. Our requirement is that a constant fraction of correct
nodes remain able to achieve reliable communication. Existing solutions can
only tolerate a fixed number of Byzantine failures if they adopt a worst-case
placement scheme. Besides, if we assume a constant Byzantine ratio (each node
has the same probability of being Byzantine), the probability of a fatal
placement approaches 1 when the number of nodes increases, and reliability
guarantees collapse. In this paper, we propose the first broadcast protocol
that overcomes these difficulties. First, the number of Byzantine failures that
can be tolerated (if they adopt the worst-case placement) now increases with
the number of nodes. Second, we are able to tolerate a constant Byzantine
ratio, however large the grid may be. In other words, the grid becomes
scalable. This result has important security applications in ultra-large
networks, where each node has a given probability to misbehave. Comment: 17 page
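The claim that a fatal placement becomes almost certain under a constant Byzantine ratio can be illustrated with a small calculation. The sketch below is not taken from the paper: it assumes, purely for illustration, that a placement is fatal as soon as one fixed-size group of nodes is entirely Byzantine, and that the groups considered are disjoint (hence independent). The function name and the parameters `pattern_size` and `num_patterns` are hypothetical.

```python
def prob_some_fatal_pattern(p: float, pattern_size: int, num_patterns: int) -> float:
    """Probability that at least one of `num_patterns` disjoint node groups
    is entirely Byzantine, when each node is Byzantine independently with
    probability `p`.

    Assumption (not from the paper): a single all-Byzantine group of
    `pattern_size` nodes is enough to make the placement fatal.
    """
    # Probability that one given group consists only of Byzantine nodes.
    q = p ** pattern_size
    # Complement of "no group is entirely Byzantine"; the groups are
    # disjoint, so their states are independent.
    return 1.0 - (1.0 - q) ** num_patterns
```

For any fixed `p > 0` and `pattern_size`, the returned value grows toward 1 as `num_patterns` grows with the size of the grid, which is the collapse of guarantees the abstract describes for existing solutions.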
Parameterizable Byzantine Broadcast in Loosely Connected Networks
We consider the problem of reliably broadcasting information in a multihop
asynchronous network, despite the presence of Byzantine failures: some nodes
are malicious and behave arbitrarily. We focus on non-cryptographic solutions.
Most existing approaches give conditions for perfect reliable broadcast (all
correct nodes deliver the good information), but require a highly connected
network. A probabilistic approach was recently proposed for loosely connected
networks: the Byzantine failures are randomly distributed, and the correct
nodes deliver the good information with high probability. A first solution
requires the nodes to initially know their position in the network, which may be
difficult or impossible in self-organizing or dynamic networks. A second
solution relaxed this hypothesis but has much weaker Byzantine tolerance
guarantees. In this paper, we propose a parameterizable broadcast protocol that
does not require nodes to have any knowledge about the network. We give a
deterministic technique to compute a set of nodes that always deliver authentic
information, for a given set of Byzantine failures. Then, we use this technique
to experimentally evaluate our protocol, and show that it significantly
outperforms previous solutions with the same hypotheses. Important disclaimer:
these results have NOT yet been published in an international conference or
journal. This is just a technical report presenting intermediary and incomplete
results. A generalized version of these results may be under submission.
Rapid Recovery for Systems with Scarce Faults
Our goal is to achieve a high degree of fault tolerance through the control
of safety-critical systems. This reduces to solving a game between a
malicious environment that injects failures and a controller who tries to
establish a correct behavior. We suggest a new control objective for such
systems that offers a better balance between complexity and precision: we seek
systems that are k-resilient. In order to be k-resilient, a system needs to be
able to rapidly recover from a small number, up to k, of local faults
infinitely many times, provided that blocks of up to k faults are separated by
short recovery periods in which no fault occurs. k-resilience is a simple but
powerful abstraction from the precise distribution of local faults, but much
more refined than the traditional objective to maximize the number of local
faults. We argue why we believe this to be the right level of abstraction for
safety-critical systems when local faults are few and far between. We show that
the computational complexity of constructing optimal control with respect to
resilience is low and demonstrate the feasibility through an implementation and
experimental results. Comment: In Proceedings GandALF 2012, arXiv:1210.202
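The fault assumption behind k-resilience, as described in the abstract above, can be made concrete with a small trace checker. This is a minimal sketch, not the paper's construction: it assumes a discrete-time trace in which each step either carries one local fault or none, and it introduces a hypothetical parameter `recovery` for the length of a fault-free recovery period.

```python
from typing import Iterable


def respects_fault_assumption(trace: Iterable[bool], k: int, recovery: int) -> bool:
    """Check that a fault trace consists of blocks of at most `k` faults
    separated by at least `recovery` consecutive fault-free steps.

    `trace` yields True for a step with a local fault, False otherwise.
    `recovery` is a hypothetical parameter; the abstract only says the
    recovery periods are short.
    """
    faults_in_block = 0
    quiet_steps = 0
    for fault in trace:
        if fault:
            faults_in_block += 1
            quiet_steps = 0
            if faults_in_block > k:
                return False  # more than k faults without a full recovery period
        else:
            quiet_steps += 1
            if quiet_steps >= recovery:
                faults_in_block = 0  # recovery period completed, new block starts
    return True
```

Under these assumptions, a k-resilient system is only required to behave correctly on traces for which this check returns True; on other traces the environment has injected faults faster than the objective demands recovery.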
On Byzantine Broadcast in Loosely Connected Networks
We consider the problem of reliably broadcasting information in a multihop
asynchronous network that is subject to Byzantine failures. Most existing
approaches give conditions for perfect reliable broadcast (all correct nodes
deliver the authentic message and nothing else), but they require a highly
connected network. An approach giving only probabilistic guarantees (correct
nodes deliver the authentic message with high probability) was recently
proposed for loosely connected networks, such as grids and tori. Yet, the
proposed solution requires a specific initialization (that includes global
knowledge) of each node, which may be difficult or impossible to guarantee in
self-organizing networks (for instance, wireless sensor networks), especially
if they are prone to Byzantine failures. In this paper, we propose a new
protocol offering guarantees for loosely connected networks that does not
require such global-knowledge-dependent initialization. In more detail, we
give a methodology to determine whether a set of nodes will always deliver the
authentic message, in any execution. Then, we give conditions for perfect
reliable broadcast in a torus network. Finally, we provide experimental
evaluation for our solution, and determine the number of randomly distributed
Byzantine failures that can be tolerated, for a given correct broadcast
probability. Comment: 1