Coordination-Free Byzantine Replication with Minimal Communication Costs
State-of-the-art fault-tolerant and federated data management systems rely on fully-replicated designs in which all participants have equivalent roles. Consequently, these systems have only limited scalability and are ill-suited for high-performance data management. As an alternative, we propose a hierarchical design in which a Byzantine cluster manages data, while an arbitrary number of learners can reliably learn these updates and use the corresponding data.
To realize our design, we propose the delayed-replication algorithm, an efficient solution to the Byzantine learner problem that is central to our design. The delayed-replication algorithm is coordination-free, scalable, and has minimal communication cost for all participants involved. In doing so, the delayed-replication algorithm opens the door to new high-performance fault-tolerant and federated data management systems. To illustrate this, we show that the delayed-replication algorithm is not only useful to support specialized learners, but can also be used to reduce the overall communication cost of permissioned blockchains and to improve their storage scalability.
Scalable Routing Easy as PIE: a Practical Isometric Embedding Protocol (Technical Report)
We present PIE, a scalable routing scheme that achieves 100% packet delivery
and low path stretch. It is easy to implement in a distributed fashion and
works well when costs are associated to links. Scalability is achieved by using
virtual coordinates in a space of concise dimensionality, which enables greedy
routing based only on local knowledge. PIE is a general routing scheme, meaning
that it works on any graph. We focus however on the Internet, where routing
scalability is an urgent concern. We show analytically and by using simulation
that the scheme scales extremely well on Internet-like graphs. In addition, its
geometric nature allows it to react efficiently to topological changes or
failures by finding new paths in the network at no cost, yielding better
delivery ratios than standard algorithms. The proposed routing scheme needs an
amount of memory polylogarithmic in the size of the network and requires only
local communication between the nodes. Although each node constructs its
coordinates and routes packets locally, the path stretch remains extremely low,
even lower than for centralized or less scalable state-of-the-art algorithms:
PIE always finds short paths and often enough finds the shortest paths.
Comment: This work has been previously published in IEEE ICNP'11. The present
document contains an additional optional mechanism, presented in Section
III-D, to further improve performance by using route asymmetry. It also
contains new simulation results.
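The greedy forwarding rule that the abstract relies on (each node picks a next hop using only its own neighbors' coordinates) can be sketched as follows. This is a minimal illustration, not PIE itself: the 2-D Euclidean coordinates, the example graph, and the name `greedy_route` are assumptions made for the sketch, and the coordinate construction that gives PIE its delivery guarantee is not reproduced here.

```python
import math

def greedy_route(graph, coords, src, dst):
    """Forward hop by hop to the neighbor closest to dst in coordinate space.

    graph  : dict mapping node -> list of neighbor nodes (hypothetical example data)
    coords : dict mapping node -> tuple of virtual coordinates (assumed Euclidean here)
    Returns the list of visited nodes, or None if greedy forwarding gets stuck.
    """
    path = [src]
    current = src
    while current != dst:
        here = math.dist(coords[current], coords[dst])
        # Only local knowledge is used: the current node's own neighbor list.
        best = min(graph[current], key=lambda v: math.dist(coords[v], coords[dst]))
        if math.dist(coords[best], coords[dst]) >= here:
            return None  # local minimum: no neighbor is strictly closer to dst
        path.append(best)
        current = best
    return path

# Hypothetical five-node topology with hand-assigned coordinates.
coords = {"a": (0, 0), "b": (1, 0), "c": (2, 0), "d": (1, 1), "e": (2, 1)}
graph = {
    "a": ["b", "d"],
    "b": ["a", "c"],
    "c": ["b", "e"],
    "d": ["a", "e"],
    "e": ["d", "c"],
}

print(greedy_route(graph, coords, "a", "e"))
```

With arbitrary coordinates, greedy forwarding can reach a node that has no neighbor closer to the destination, which is why the sketch returns `None` in that case; a scheme with guaranteed delivery has to choose coordinates that rule this out.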
Lock-free Concurrent Data Structures
Concurrent data structures are the data sharing side of parallel programming.
Data structures give the means to the program to store data, but also provide
operations to the program to access and manipulate these data. These operations
are implemented through algorithms that have to be efficient. In the sequential
setting, data structures are crucially important for the performance of the
respective computation. In the parallel programming setting, their importance
becomes more crucial because of the increased use of data and resource sharing
for utilizing parallelism.
The first and main goal of this chapter is to provide a sufficient background
and intuition to help the interested reader to navigate in the complex research
area of lock-free data structures. The second goal is to offer the programmer
familiarity to the subject that will allow her to use truly concurrent methods.
Comment: To appear in "Programming Multi-core and Many-core Computing
Systems", eds. S. Pllana and F. Xhafa, Wiley Series on Parallel and
Distributed Computing.
Keeping Authorities "Honest or Bust" with Decentralized Witness Cosigning
The secret keys of critical network authorities - such as time, name,
certificate, and software update services - represent high-value targets for
hackers, criminals, and spy agencies wishing to use these keys secretly to
compromise other hosts. To protect authorities and their clients proactively
from undetected exploits and misuse, we introduce CoSi, a scalable witness
cosigning protocol ensuring that every authoritative statement is validated and
publicly logged by a diverse group of witnesses before any client will accept
it. A statement S collectively signed by W witnesses assures clients that S has
been seen, and not immediately found erroneous, by those W observers. Even if S
is compromised in a fashion not readily detectable by the witnesses, CoSi still
guarantees S's exposure to public scrutiny, forcing secrecy-minded attackers to
risk that the compromise will soon be detected by one of the W witnesses.
Because clients can verify collective signatures efficiently without
communication, CoSi protects clients' privacy, and offers the first
transparency mechanism effective against persistent man-in-the-middle attackers
who control a victim's Internet access, the authority's secret key, and several
witnesses' secret keys. CoSi builds on existing cryptographic multisignature
methods, scaling them to support thousands of witnesses via signature
aggregation over efficient communication trees. A working prototype
demonstrates CoSi in the context of timestamping and logging authorities,
enabling groups of over 8,000 distributed witnesses to cosign authoritative
statements in under two seconds.
Comment: 20 pages, 7 figures
On the design and implementation of broadcast and global combine operations using the postal model
A number of models have been proposed in recent years for message-passing parallel systems. Examples are the postal model and its generalization, the LogP model. In the postal model a parameter λ is used to model the communication latency of the message-passing system. Each node during each round can send a fixed-size message and, simultaneously, receive a message of the same size. Furthermore, a message sent out during round r will incur a latency of λ and will arrive at the receiving node at round r + λ - 1.
Our goal in this paper is to bridge the gap between the theoretical modeling and the practical implementation. In particular, we investigate a number of practical issues related to the design and implementation of two collective communication operations, namely, the broadcast operation and the global combine operation. Those practical issues include, for example, 1) techniques for measurement of the value of λ on a given machine, 2) creating efficient broadcast algorithms that get the latency λ and the number of nodes n as parameters and 3) creating efficient global combine algorithms for parallel machines where λ is not an integer. We propose solutions that address those practical issues and present results of an experimental study of the new algorithms on the Intel Delta machine. Our main conclusion is that the postal model can help in performance prediction and tuning; for example, a properly tuned broadcast improves the known implementation by more than 20%.
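The timing rule of the postal model (a message sent in round r arrives in round r + λ - 1) can be made concrete with a small simulation. The sketch below is illustrative only and is not one of the paper's algorithms: it assumes an integer λ ≥ 1 and a simple greedy schedule in which every node that already holds the message sends it to a new node in every round, and the name `broadcast_rounds` is hypothetical.

```python
def broadcast_rounds(n: int, lam: int) -> int:
    """Rounds until n nodes hold the message under a greedy postal-model broadcast.

    Assumptions (not taken from the paper): lam is an integer >= 1, node 0 holds
    the message before round 1, and every informed node sends one message per
    round to a node that has not been targeted yet.
    """
    if n < 1 or lam < 1:
        raise ValueError("n and lam must be positive integers")
    arrivals = {}   # round -> number of messages arriving in that round
    active = 1      # nodes able to send at the start of the current round
    informed = 1    # nodes that hold the message
    r = 0
    while informed < n:
        r += 1
        # Nodes that received the message in the previous round start sending now.
        active += arrivals.pop(r - 1, 0)
        # A message sent in round r arrives in round r + lam - 1.
        arrival_round = r + lam - 1
        arrivals[arrival_round] = arrivals.get(arrival_round, 0) + active
        # Count the messages that arrive during this round.
        informed += arrivals.get(r, 0)
    return r

for lam in (1, 2, 3):
    print(lam, broadcast_rounds(8, lam))
```

With λ = 1 the informed set doubles every round, so 8 nodes are reached after 3 rounds; with λ = 2 the same greedy schedule needs 5 rounds, which shows how the latency parameter changes the cost of a broadcast.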
CRDTs: Consistency without concurrency control
A CRDT is a data type whose operations commute when they are concurrent.
Replicas of a CRDT eventually converge without any complex concurrency control.
As an existence proof, we exhibit a non-trivial CRDT: a shared edit buffer
called Treedoc. We outline the design, implementation and performance of
Treedoc. We discuss how the CRDT concept can be generalised, and its
limitations.
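The convergence property stated in the abstract can be illustrated with a much simpler data type than Treedoc. The sketch below is an assumption-laden toy, not the paper's design: it uses an operation-based counter whose operations are signed deltas (which commute), and the names `CounterReplica` and `converges` are hypothetical.

```python
from itertools import permutations

class CounterReplica:
    """One replica of a toy operation-based counter (not Treedoc)."""

    def __init__(self):
        self.value = 0

    def apply(self, delta: int) -> None:
        # Addition commutes, so the order in which deltas are applied is irrelevant.
        self.value += delta

def converges(ops) -> bool:
    """Apply the same operations in every possible order and compare final states."""
    finals = set()
    for order in permutations(ops):
        replica = CounterReplica()
        for delta in order:
            replica.apply(delta)
        finals.add(replica.value)
    return len(finals) == 1

# Three concurrent updates, delivered to replicas in arbitrary order.
ops = [+5, -2, +7]
print(converges(ops))
```

Because every delivery order yields the same final value, replicas that have received the same set of operations agree without any locking or ordering protocol. Designing commuting operations for a richer type such as an edit buffer is the hard part the paper addresses.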
Towards a Scalable Dynamic Spatial Database System
With the rise of GPS-enabled smartphones and other similar mobile devices,
massive amounts of location data are available. However, no scalable solutions
for soft real-time spatial queries on large sets of moving objects have yet
emerged. In this paper we explore and measure the limits of existing algorithms
and implementations in different application scenarios. Finally, we
propose a novel distributed architecture to solve the scalability issues.
Comment: (2012