26 research outputs found
Toward Domain-Specific Solvers for Distributed Consistency
To guard against machine failures, modern internet services store multiple replicas of the same application data within and across data centers, which introduces the problem of keeping geo-distributed replicas consistent with one another in the face of network partitions and unpredictable message latency. To avoid costly and conservative synchronization protocols, many real-world systems provide only weak consistency guarantees (e.g., eventual, causal, or PRAM consistency), which permit certain kinds of disagreement among replicas.
There has been much recent interest in language support for specifying and verifying such consistency properties. Although these properties are usually beyond the scope of what traditional type checkers or compiler analyses can guarantee, solver-aided languages are up to the task. Inspired by systems like Liquid Haskell [Vazou et al., 2014] and Rosette [Torlak and Bodik, 2014], we believe that close integration between a language and a solver is the right path to consistent-by-construction distributed applications. Unfortunately, verifying distributed consistency properties requires reasoning about transitive relations (e.g., causality or happens-before), partial orders (e.g., the lattice of replica states under a convergent merge operation), and properties relevant to message processing or API invocation (e.g., commutativity and idempotence) that cannot be easily or efficiently carried out by general-purpose SMT solvers that lack native support for this kind of reasoning.
We argue that domain-specific SMT-based tools that exploit the mathematical foundations of distributed consistency would enable both more efficient verification and improved ease of use for domain experts. The principle of exploiting domain knowledge for efficiency and expressivity has borne fruit elsewhere (for example, in the development of high-performance domain-specific languages that trade off generality to gain both performance and productivity), and it also applies here. Languages augmented with domain-specific, consistency-aware solvers would support the rapid implementation of formally verified programming abstractions that guarantee distributed consistency. In the long run, we aim to democratize the development of such domain-specific solvers by creating a framework that brings the implementation of new theory solvers within the reach of programmers who are not necessarily experts in SMT solver internals.
An Exceptional Actor System (Functional Pearl)
The Glasgow Haskell Compiler is known for its feature-laden runtime system
(RTS), which includes lightweight threads, asynchronous exceptions, and a slew
of other features. Their combination is powerful enough that a programmer may
complete the same task in many different ways -- some more advisable than
others.
We present a user-accessible actor framework hidden in plain sight within the
RTS and demonstrate it on a classic example from the distributed systems
literature. We then extend both the framework and example to the realm of
dynamic types. Finally, we raise questions about how RTS features intersect and
possibly subsume one another, and suggest that GHC can guide good practice by
constraining the use of some features.
Comment: To appear at Haskell Symposium 202
Deterministic Threshold Queries of Distributed Data Structures
Convergent replicated data types, or CvRDTs, are lattice-based data structures for enforcing the eventual consistency of replicated objects in a distributed system. Although CvRDTs are provably eventually consistent, queries of CvRDTs nevertheless allow inconsistent intermediate states of replicas to be observed; and although in practice, many systems allow a mix of eventually consistent and strongly consistent queries, CvRDTs only support the former. Taking inspiration from our previous work on LVars for deterministic parallel programming, we show how to extend CvRDTs to support deterministic, strongly consistent queries using a mechanism called threshold queries. The threshold query technique generalizes to any lattice, and hence any CvRDT, and allows deterministic observations to be made of replicated objects before the replicas' states have converged.
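The threshold-query idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's formalization: the CvRDT is assumed to be a grow-only set whose merge is set union, the names `GSet` and `threshold_query` are hypothetical, and returning `None` stands in for a query that would block.

```python
from typing import FrozenSet, Optional


class GSet:
    """Hypothetical grow-only set CvRDT; states are ordered by subset inclusion."""

    def __init__(self) -> None:
        self.items: FrozenSet[str] = frozenset()

    def add(self, item: str) -> None:
        # Updates are inflationary: the state only ever grows.
        self.items = self.items | {item}

    def merge(self, other: "GSet") -> None:
        # The least upper bound of two states is their union.
        self.items = self.items | other.items


def threshold_query(replica: GSet, threshold: FrozenSet[str]) -> Optional[FrozenSet[str]]:
    """Return the threshold itself once the replica's state is at or above it.

    The exact replica state is never exposed, so two replicas that have both
    passed the threshold give the same answer even if their states differ.
    None stands in for "would block" in this sketch.
    """
    if threshold <= replica.items:
        return threshold
    return None


r1, r2 = GSet(), GSet()
r1.add("a")
r1.add("b")
r2.add("c")
t = frozenset({"a"})

before = threshold_query(r2, t)    # r2 has not yet seen "a"
r2.merge(r1)                       # r2 learns r1's state; r1 is still behind r2
after_r1 = threshold_query(r1, t)
after_r2 = threshold_query(r2, t)
```

After the merge, `r1` and `r2` still hold different states, yet both queries return the same value, which is the sense in which the observation is deterministic before convergence.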
Inductive diagrams for causal reasoning
The Lamport diagram is a pervasive and intuitive tool for informal reasoning
about causality in a concurrent system. However, traditional axiomatic
formalizations of Lamport diagrams can be painful to work with in a mechanized
setting like Agda, whereas inductively-defined data would enjoy structural
induction and automatic normalization. We propose an alternative, inductive
formalization -- the causal separation diagram (CSD) -- that takes inspiration
from string diagrams and concurrent separation logic. CSDs enjoy a graphical
syntax similar to Lamport diagrams, and can be given compositional semantics in
a variety of domains. We demonstrate the utility of CSDs by applying them to
logical clocks -- widely-used mechanisms for reifying causal relationships as
data -- yielding a generic proof of Lamport's clock condition that is
parametric in a choice of clock. We instantiate this proof on Lamport's scalar
clock, on Mattern's vector clock, and on the matrix clocks of Raynal et al. and
of Wuu and Bernstein, yielding verified implementations of each. Our results
and general framework are mechanized in the Agda proof assistant.
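For readers unfamiliar with the clocks named above, the following Python sketch shows the update rules for a Lamport scalar clock and a vector clock and checks the clock condition (if event a happens before event b, then a's timestamp is less than b's) on one small scenario. It is an informal illustration, not the Agda mechanization; all function names are hypothetical, and checking one scenario is of course not a proof.

```python
from typing import Tuple

Vector = Tuple[int, ...]


def lamport_tick(clock: int) -> int:
    """Local or send event: advance the scalar clock."""
    return clock + 1


def lamport_recv(clock: int, msg_ts: int) -> int:
    """Receive event: jump past both the local clock and the message timestamp."""
    return max(clock, msg_ts) + 1


def vc_tick(vc: Vector, i: int) -> Vector:
    """Local or send event at process i: advance component i."""
    return tuple(x + 1 if k == i else x for k, x in enumerate(vc))


def vc_recv(vc: Vector, msg_vc: Vector, i: int) -> Vector:
    """Receive event at process i: pointwise max, then tick component i."""
    merged = tuple(max(x, y) for x, y in zip(vc, msg_vc))
    return vc_tick(merged, i)


def vc_lt(u: Vector, v: Vector) -> bool:
    """Strict vector-clock order: pointwise <= and not equal."""
    return all(x <= y for x, y in zip(u, v)) and u != v


# Scenario: process 0 performs send event a; process 1 performs local event c,
# then receive event b for a's message. So a -> b and c -> b, while a and c
# are concurrent.
a_l = lamport_tick(0)
c_l = lamport_tick(0)
b_l = lamport_recv(c_l, a_l)

a_v = vc_tick((0, 0), 0)
c_v = vc_tick((0, 0), 1)
b_v = vc_recv(c_v, a_v, 1)
```

In this scenario both clocks order a and c before b. The vector clock additionally leaves the concurrent events a and c unordered, whereas the scalar clock assigns them equal timestamps.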
Portable, Efficient, and Practical Library-Level Choreographic Programming
Choreographic programming (CP) is an emerging paradigm for programming
distributed applications that run on multiple nodes. In CP, the programmer
writes one program, called a choreography, that is then transformed to
individual programs for each node via a compilation step called endpoint
projection (EPP). While CP languages have existed for over a decade,
library-level CP -- in which choreographies are expressed as programs in an
existing host language, and choreographic language constructs and EPP are
provided entirely by a host-language library -- is in its infancy.
Library-level CP has great potential, but existing implementations have
portability, efficiency, and practicality drawbacks that hinder its adoption.
In this paper, we aim to advance the state of the art of library-level CP
with two novel techniques for choreographic library design and implementation:
endpoint projection as dependency injection (EPP-as-DI), and choreographic
enclaves. EPP-as-DI is a language-agnostic technique for implementing EPP at
the library level. Unlike existing library-level approaches, EPP-as-DI asks
little from the host language -- support for higher-order functions is all that
is required -- making it usable in a wide variety of host languages.
Choreographic enclaves are a language feature that lets the programmer define
sub-choreographies within a larger choreography. Within an enclave, "knowledge
of choice" is propagated only among the enclave's participants, enabling the
seamless use of the host language's conditional constructs while addressing the
efficiency limitations of existing library-level CP implementations.
We implement EPP-as-DI and choreographic enclaves in ChoRus, the first CP
library for the Rust programming language. Our case studies and benchmarks
demonstrate that the usability and performance of ChoRus compare favorably to
traditional distributed programming in Rust.
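The idea of implementing endpoint projection through dependency injection can be sketched roughly as follows. This is a Python illustration of the general technique, not the ChoRus API (ChoRus is a Rust library): the names `Endpoint`, `locally`, `comm`, `hello`, and `run` are hypothetical, in-process queues and threads stand in for a network, and choreographic enclaves are not modeled. The choreography is an ordinary higher-order function; each node runs it with a different injected operator.

```python
import queue
import threading


class Endpoint:
    """Hypothetical operator injected into a choreography on behalf of one location."""

    def __init__(self, me, channels):
        self.me = me
        self.channels = channels  # maps (sender, receiver) to a Queue

    def locally(self, loc, f):
        # Run f only at loc; at every other location the value is absent.
        return f() if loc == self.me else None

    def comm(self, sender, receiver, value):
        # The sender's projection sends, the receiver's projection receives,
        # and every other location skips the step.
        if self.me == sender:
            self.channels[(sender, receiver)].put(value)
        if self.me == receiver:
            return self.channels[(sender, receiver)].get()
        return None


def hello(op):
    """One choreography, written once, describing both participants."""
    msg = op.locally("alice", lambda: "hello")
    at_bob = op.comm("alice", "bob", msg)
    reply = op.locally("bob", lambda: at_bob.upper())
    return op.comm("bob", "alice", reply)


def run(choreo, locations):
    """Run the same choreography once per location, each with its own operator."""
    channels = {(s, r): queue.Queue() for s in locations for r in locations}
    results = {}

    def worker(loc):
        results[loc] = choreo(Endpoint(loc, channels))

    threads = [threading.Thread(target=worker, args=(loc,)) for loc in locations]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


results = run(hello, ["alice", "bob"])
```

Because the choreography only needs to accept an operator and pass closures to it, the host language requirement in this sketch is just higher-order functions, which matches the portability argument made in the abstract.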
Parallelizing Julia with a Non-Invasive DSL (Artifact)
This artifact is based on ParallelAccelerator, an embedded domain-specific language (DSL) and compiler for speeding up compute-intensive Julia programs. In particular, Julia code that makes heavy use of aggregate array operations is a good candidate for speeding up with ParallelAccelerator. ParallelAccelerator is a non-invasive DSL that makes as few changes to the host programming model as possible.
Parallelizing Julia with a Non-Invasive DSL
Computational scientists often prototype software using productivity
languages that offer high-level programming abstractions. When higher
performance is needed, they are obliged to rewrite their code in a
lower-level efficiency language. Different solutions have been
proposed to address this trade-off between productivity and
efficiency. One promising approach is to create embedded
domain-specific languages that sacrifice generality for productivity
and performance, but practical experience with DSLs points to some
roadblocks preventing widespread adoption. This paper proposes a
non-invasive domain-specific language that makes as few visible
changes to the host programming model as possible. We present ParallelAccelerator,
a library and compiler for high-level, high-performance scientific
computing in Julia. ParallelAccelerator's programming model is aligned with existing
Julia programming idioms. Our compiler exposes the implicit
parallelism in high-level array-style programs and compiles them to
fast, parallel native code. Programs can also run in "library-only"
mode, letting users benefit from the full Julia environment and
libraries. Our results show encouraging performance improvements with very few changes to source code required. In particular, few to no additional type annotations are necessary.
Toward Scalable Verification for Safety-Critical Deep Networks
The increasing use of deep neural networks for safety-critical applications, such as autonomous driving and flight control, raises concerns about their safety and reliability. Formal verification can address these concerns by guaranteeing that a deep learning system operates as intended, but the state of the art is limited to small systems. In this work-in-progress report we give an overview of our work on mitigating this difficulty, by pursuing two complementary directions: devising scalable verification techniques, and identifying design choices that result in deep learning systems that are more amenable to verification.
Lattice-based data structures for deterministic parallel and distributed programming
Deterministic-by-construction parallel programming models guarantee that programs have the same observable behavior on every run, promising freedom from bugs caused by schedule nondeterminism. To make that guarantee, though, they must sharply restrict sharing of state between parallel tasks, usually either by disallowing sharing entirely or by restricting it to one type of data structure, such as single-assignment locations. I show that lattice-based data structures, or LVars, are the foundation for a guaranteed-deterministic parallel programming model that allows a more general form of sharing. LVars allow multiple assignments that are inflationary with respect to a given lattice. They ensure determinism by allowing only inflationary writes and "threshold" reads that block until a lower bound is reached. After presenting the basic LVars model, I extend it to support event handlers, which enable an event-driven programming style, and non-blocking "freezing" reads, resulting in a quasi-deterministic model in which programs behave deterministically modulo exceptions. I demonstrate the viability of the LVars model with LVish, a Haskell library that provides a collection of lattice-based data structures, a work-stealing scheduler, and a monad in which LVar computations run. LVish leverages Haskell's type system to index such computations with effect levels to ensure that only certain LVar effects can occur, hence statically enforcing determinism or quasi-determinism. I present two case studies of parallelizing existing programs using LVish: a k-CFA control flow analysis, and a bioinformatics application for comparing phylogenetic trees. Finally, I show how LVar-style threshold reads apply to the setting of convergent replicated data types (CvRDTs), which specify the behavior of eventually consistent replicated objects in a distributed system. I extend the CvRDT model to support deterministic, strongly consistent threshold queries. 
The technique generalizes to any lattice, and hence any CvRDT, and allows deterministic observations to be made of replicated objects before the replicas' states converge.
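The core LVars discipline described above (inflationary writes plus blocking threshold reads) can be illustrated with a small shared-memory sketch in Python. This is not LVish, which is a Haskell library; the class `MaxLVar`, its methods, and `run_demo` are hypothetical, the lattice is assumed to be the non-negative integers ordered by <= with max as least upper bound, and `peek` exists only so the demo can inspect the final state after all threads have finished.

```python
import threading


class MaxLVar:
    """Hypothetical LVar over non-negative integers, with least upper bound = max."""

    def __init__(self):
        self._state = 0  # bottom element of the lattice
        self._cond = threading.Condition()

    def put(self, value):
        # Inflationary write: the state can only move up in the lattice.
        with self._cond:
            self._state = max(self._state, value)
            self._cond.notify_all()

    def get(self, threshold):
        # Threshold read: block until the state reaches the threshold, then
        # return the threshold rather than the exact state, so the result
        # does not depend on how the writers were scheduled.
        with self._cond:
            self._cond.wait_for(lambda: self._state >= threshold)
            return threshold

    def peek(self):
        # Demo-only helper; reading the exact state is not a threshold read.
        with self._cond:
            return self._state


def run_demo():
    lv = MaxLVar()
    observed = []
    reader = threading.Thread(target=lambda: observed.append(lv.get(5)))
    writers = [threading.Thread(target=lv.put, args=(v,)) for v in (3, 7, 5)]
    reader.start()
    for w in writers:
        w.start()
    for t in [reader, *writers]:
        t.join()
    return observed[0], lv.peek()


observed_value, final_state = run_demo()
```

Whatever order the three writers run in, the reader observes the same value, because the read exposes only the fact that the threshold was reached.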
HasChor: Functional Choreographic Programming for All (Functional Pearl)
Choreographic programming is an emerging paradigm for programming distributed systems. In choreographic programming, the programmer describes the behavior of the entire system as a single, unified program -- a choreography -- which is then compiled to individual programs that run on each node, via a compilation step called endpoint projection. We present a new model for functional choreographic programming where choreographies are expressed as computations in a monad. Our model supports cutting-edge choreographic programming features that enable modularity and code reuse: in particular, it supports higher-order choreographies, in which a choreography may be passed as an argument to another choreography, and location-polymorphic choreographies, in which a choreography can abstract over nodes. Our model is implemented in a Haskell library, HasChor, which lets programmers write choreographic programs while using the rich Haskell ecosystem at no cost, bringing choreographic programming within reach of everyday Haskellers. Moreover, thanks to Haskell's abstractions, the implementation of the HasChor library itself is concise and understandable, boiling down endpoint projection to its short and simple essence.