Handling Network Partitions and Mergers in Structured Overlay Networks
Structured overlay networks form a major class of peer-to-peer systems, which are touted for their abilities to
scale, tolerate failures, and self-manage. Any long-lived
Internet-scale distributed system is destined to face network partitions. Although the problem of network partitions
and mergers is highly related to fault-tolerance and
self-management in large-scale systems, it has hardly been
studied in the context of structured peer-to-peer systems.
These systems have mainly been studied under churn (frequent
joins/failures), which as a side effect solves the problem
of network partitions, as it is similar to massive node
failures. Yet, the crucial aspect of network mergers has been
ignored. In fact, it has been claimed that ring-based structured
overlay networks, which constitute the majority of the
structured overlays, are intrinsically ill-suited for merging
rings. In this paper, we present an algorithm for merging
multiple similar ring-based overlays when the underlying
network merges. We examine the solution in dynamic conditions,
showing how our solution is resilient to churn during
the merger, something widely believed to be difficult or
impossible. We evaluate the algorithm for various scenarios
and show that even when falsely detecting a merger, the
algorithm quickly terminates and does not clutter the network
with many messages. The algorithm is flexible as the
tradeoff between message complexity and time complexity
can be adjusted by a parameter.
Deux: Autonomic Testing System for Operating System Upgrades
Operating system upgrades and patches sometimes break applications that worked fine on the older version. We present an autonomic approach to testing OS updates while minimizing downtime, usable without local regression suites or IT expertise. Deux utilizes a dual-layer virtual machine architecture, with lightweight application process checkpoint and resume across OS versions, enabling simultaneous execution of the same applications on both OS versions in different VMs. Inputs provided by ordinary users to the production old version are also fed to the new version. The old OS acts as a pseudo-oracle for the update, and application state is automatically re-cloned to continue testing after any output discrepancies (intercepted at the system call level), all transparently to users. If all differences are deemed inconsequential, then the VM roles are switched with the application state already in place. Our empirical evaluation with both LAMP and standalone applications demonstrates Deux's efficiency and effectiveness.
Isolation Without Taxation: Near-Zero-Cost Transitions for WebAssembly and SFI
Software sandboxing or software-based fault isolation (SFI) is a lightweight
approach to building secure systems out of untrusted components. Mozilla, for
example, uses SFI to harden the Firefox browser by sandboxing third-party
libraries, and companies like Fastly and Cloudflare use SFI to safely co-locate
untrusted tenants on their edge clouds. While there have been significant
efforts to optimize and verify SFI enforcement, context switching in SFI
systems remains largely unexplored: almost all SFI systems use
heavyweight transitions that are not only error-prone but incur
significant performance overhead from saving, clearing, and restoring registers
when context switching. We identify a set of zero-cost conditions that
characterize when sandboxed code has sufficient structure to guarantee
security via lightweight zero-cost transitions (simple function calls).
We modify the Lucet Wasm compiler and its runtime to use zero-cost transitions,
eliminating the undue performance tax on systems that rely on Lucet for
sandboxing (e.g., we speed up image and font rendering in Firefox by up to
29.7% and 10%, respectively). To remove the Lucet compiler and its correct
implementation of the Wasm specification from the trusted computing base, we
(1) develop a static binary verifier, VeriZero, which (in seconds)
checks that binaries produced by Lucet satisfy our zero-cost conditions, and
(2) prove the soundness of VeriZero by developing a logical relation that
captures when a compiled Wasm function is semantically well-behaved with
respect to our zero-cost conditions. Finally, we show that our model is useful
beyond Wasm by describing a new, purpose-built SFI system, SegmentZero32, that
uses x86 segmentation and LLVM with mostly off-the-shelf passes to enforce our
zero-cost conditions; our prototype performs on par with the state-of-the-art
Native Client SFI system.
Apache Calcite: A Foundational Framework for Optimized Query Processing Over Heterogeneous Data Sources
Apache Calcite is a foundational software framework that provides query
processing, optimization, and query language support to many popular
open-source data processing systems such as Apache Hive, Apache Storm, Apache
Flink, Druid, and MapD. Calcite's architecture consists of a modular and
extensible query optimizer with hundreds of built-in optimization rules, a
query processor capable of processing a variety of query languages, an adapter
architecture designed for extensibility, and support for heterogeneous data
models and stores (relational, semi-structured, streaming, and geospatial).
This flexible, embeddable, and extensible architecture is what makes Calcite an
attractive choice for adoption in big-data frameworks. It is an active project
that continues to introduce support for the new types of data sources, query
languages, and approaches to query processing and optimization.
Comment: SIGMOD'1
When Private Blockchain Meets Deterministic Database
Private blockchain, as a replicated transactional system, shares many
commonalities with distributed databases. However, the close relationship between private
blockchain and deterministic database has never been studied. In essence,
private blockchain and deterministic database both ensure replica consistency
through determinism. In this paper, we present a comprehensive analysis to uncover
the connections between private blockchain and deterministic database. While
private blockchains have started to pursue deterministic transaction executions
recently, deterministic databases have already studied deterministic
concurrency control protocols for almost a decade. This motivates us to propose
Harmony, a novel deterministic concurrency control protocol designed for
blockchain use. We use Harmony to build a new relational blockchain, namely
HarmonyBC, which features low abort rates, hotspot resiliency, and inter-block
parallelism, all of which are especially important to disk-oriented blockchain.
Empirical results on Smallbank, YCSB, and TPC-C show that HarmonyBC achieves 2.0x
to 3.5x higher throughput than state-of-the-art private blockchains.
- …