Search CORE

8,980 research outputs found

An approach to achieve message efficient early-stopping uniform consensus protocols

Author: Cao J
Teo YM
Wang X
Publication venue: IEEE Computer Society
Publication date: 11/12/2014
Field of study

2003-2004 > Academic research: refereed > Refereed conference paperVersion of RecordPublishe

CiteSeerX

PolyU Institutional Repository

Lower Bounds for Achieving Synchronous Early Stopping Consensus with Orderly Crash Failures

Author: Cao Jiannong
Teo Yong Meng
Wang Xianbing
Publication venue
Publication date: 01/01/2005
Field of study

In this paper, we discuss the consensus problem for synchronous distributed systems with orderly crash failures. For a synchronous distributed system of n processes with up to t crash failures and f failures actually occur, first, we present a bivalency argument proof to solve the open problem of proving the lower bound, min (t + 1, f + 2) rounds, for early-stopping synchronous consensus with orderly crash failures, where t < n - 1. Then, we extend the system model with orderly crash failures to a new model in which a process is allowed to send multiple messages to the same destination process in a round and the failing processes still respect the order specified by the protocol in sending messages. For this new model, we present a uniform consensus protocol, in which all non-faulty processes always decide and stop immediately by the end of f + 1 rounds. We prove that the lower bound of early stopping protocols for both consensus and uniform consensus are f + 1 rounds under the new model, and our proposed protocol is optimal.Singapore-MIT Alliance (SMA

DSpace@MIT

On the Round Complexity of Randomized Byzantine Agreement

Author: Cohen Ran
Haitner Iftach
Makriyannis Nikolaos
Orland Matan
Samorodnitsky Alex
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 33rd International Symposium on Distributed Computing (DISC 2019)
Publication date: 01/01/2019
Field of study

We prove lower bounds on the round complexity of randomized Byzantine agreement (BA) protocols, bounding the halting probability of such protocols after one and two rounds. In particular, we prove that: 1) BA protocols resilient against n/3 [resp., n/4] corruptions terminate (under attack) at the end of the first round with probability at most o(1) [resp., 1/2+ o(1)]. 2) BA protocols resilient against n/4 corruptions terminate at the end of the second round with probability at most 1-Theta(1). 3) For a large class of protocols (including all BA protocols used in practice) and under a plausible combinatorial conjecture, BA protocols resilient against n/3 [resp., n/4] corruptions terminate at the end of the second round with probability at most o(1) [resp., 1/2 + o(1)]. The above bounds hold even when the parties use a trusted setup phase, e.g., a public-key infrastructure (PKI). The third bound essentially matches the recent protocol of Micali (ITCS\u2717) that tolerates up to n/3 corruptions and terminates at the end of the third round with constant probability

Dagstuhl Research Online Publication Server

In Search for an Optimal Authenticated Byzantine Agreement

Author: Spiegelman Alexander
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 35th International Symposium on Distributed Computing (DISC 2021)
Publication date: 01/01/2021
Field of study

In this paper, we challenge the conventional approach of state machine replication systems to design deterministic agreement protocols in the eventually synchronous communication model. We first prove that no such protocol can guarantee bounded communication cost before the global stabilization time and propose a different approach that hopes for the best (synchrony) but prepares for the worst (asynchrony). Accordingly, we design an optimistic byzantine agreement protocol that first tries an efficient deterministic algorithm that relies on synchrony for termination only, and then, only if an agreement was not reached due to asynchrony, the protocol uses a randomized asynchronous protocol for fallback that guarantees termination with probability 1. We formally prove that our protocol achieves optimal communication complexity under all network conditions and failure scenarios. We first prove a lower bound of ?(ft+ t) for synchronous deterministic byzantine agreement protocols, where t is the failure threshold, and f is the actual number of failures. Then, we present a tight upper bound and use it for the synchronous part of the optimistic protocol. Finally, for the asynchronous fallback, we use a variant of the (optimal) VABA protocol, which we reconstruct to safely combine it with the synchronous part. We believe that our adaptive to failures synchronous byzantine agreement protocol has an independent interest since it is the first protocol we are aware of which communication complexity optimally depends on the actual number of failures

Dagstuhl Research Online Publication Server

Distributed Exploration in Multi-Armed Bandits

Author: Hillel Eshcar
Karnin Zohar
Koren Tomer
Lempel Ronny
Somekh Oren
Publication venue
Publication date: 04/11/2013
Field of study

We study exploration in Multi-Armed Bandits in a setting where

k

players collaborate in order to identify an

\epsilon

-optimal arm. Our motivation comes from recent employment of bandit algorithms in computationally intensive, large-scale applications. Our results demonstrate a non-trivial tradeoff between the number of arm pulls required by each of the players, and the amount of communication between them. In particular, our main result shows that by allowing the

k

players to communicate only once, they are able to learn

\sqrt{k}

times faster than a single player. That is, distributing learning to

k

players gives rise to a factor

\sqrt{k}

parallel speed-up. We complement this result with a lower bound showing this is in general the best possible. On the other extreme, we present an algorithm that achieves the ideal factor

k

speed-up in learning performance, with communication only logarithmic in

1/\epsilon

arXiv.org e-Print Archive

CiteSeerX

Building global and scalable systems with atomic multicast

Author: Benz Samuel
Pedone Fernando
Publication venue
Publication date: 06/04/2018
Field of study

The rise of worldwide Internet-scale services demands large distributed systems. Indeed, when handling several millions of users, it is common to operate thousands of servers spread across the globe. Here, replication plays a central role, as it contributes to improve the user experience by hiding failures and by providing acceptable latency. In this thesis, we claim that atomic multicast, with strong and well-defined properties, is the appropriate abstraction to efficiently design and implement globally scalable distributed systems. Internet-scale services rely on data partitioning and replication to provide scalable performance and high availability. Moreover, to reduce user-perceived response times and tolerate disasters (i.e., the failure of a whole datacenter), services are increasingly becoming geographically distributed. Data partitioning and replication, combined with local and geographical distribution, introduce daunting challenges, including the need to carefully order requests among replicas and partitions. One way to tackle this problem is to use group communication primitives that encapsulate order requirements. While replication is a common technique used to design such reliable distributed systems, to cope with the requirements of modern cloud based ``always-on'' applications, replication protocols must additionally allow for throughput scalability and dynamic reconfiguration, that is, on-demand replacement or provisioning of system resources. We propose a dynamic atomic multicast protocol which fulfills these requirements. It allows to dynamically add and remove resources to an online replicated state machine and to recover crashed processes. Major efforts have been spent in recent years to improve the performance, scalability and reliability of distributed systems. In order to hide the complexity of designing distributed applications, many proposals provide efficient high-level communication abstractions. Since the implementation of a production-ready system based on this abstraction is still a major task, we further propose to expose our protocol to developers in the form of distributed data structures. B-trees for example, are commonly used in different kinds of applications, including database indexes or file systems. Providing a distributed, fault-tolerant and scalable data structure would help developers to integrate their applications in a distribution transparent manner. This work describes how to build reliable and scalable distributed systems based on atomic multicast and demonstrates their capabilities by an implementation of a distributed ordered map that supports dynamic re-partitioning and fast recovery. To substantiate our claim, we ported an existing SQL database atop of our distributed lock-free data structure. Here, replication plays a central role, as it contributes to improve the user experience by hiding failures and by providing acceptable latency. In this thesis, we claim that atomic multicast, with strong and well-defined properties, is the appropriate abstraction to efficiently design and implement globally scalable distributed systems. Internet-scale services rely on data partitioning and replication to provide scalable performance and high availability. Moreover, to reduce user-perceived response times and tolerate disasters (i.e., the failure of a whole datacenter), services are increasingly becoming geographically distributed. Data partitioning and replication, combined with local and geographical distribution, introduce daunting challenges, including the need to carefully order requests among replicas and partitions. One way to tackle this problem is to use group communication primitives that encapsulate order requirements. While replication is a common technique used to design such reliable distributed systems, to cope with the requirements of modern cloud based ``always-on'' applications, replication protocols must additionally allow for throughput scalability and dynamic reconfiguration, that is, on-demand replacement or provisioning of system resources. We propose a dynamic atomic multicast protocol which fulfills these requirements. It allows to dynamically add and remove resources to an online replicated state machine and to recover crashed processes. Major efforts have been spent in recent years to improve the performance, scalability and reliability of distributed systems. In order to hide the complexity of designing distributed applications, many proposals provide efficient high-level communication abstractions. Since the implementation of a production-ready system based on this abstraction is still a major task, we further propose to expose our protocol to developers in the form of distributed data structures. B- trees for example, are commonly used in different kinds of applications, including database indexes or file systems. Providing a distributed, fault-tolerant and scalable data structure would help developers to integrate their applications in a distribution transparent manner. This work describes how to build reliable and scalable distributed systems based on atomic multicast and demonstrates their capabilities by an implementation of a distributed ordered map that supports dynamic re-partitioning and fast recovery. To substantiate our claim, we ported an existing SQL database atop of our distributed lock-free data structure

RERO DOC Digital Library