12,782 research outputs found
TANDEM: taming failures in next-generation datacenters with emerging memory
The explosive growth of online services, leading to unforeseen scales, has made modern datacenters highly prone to failures. Taming these failures hinges on fast and correct recovery, minimizing service interruptions.
Applications, owing to recovery, entail additional measures to maintain a recoverable state of data and computation logic during their failure-free execution. However, these precautionary measures have
severe implications on performance, correctness, and programmability, making recovery incredibly challenging to realize in practice.
Emerging memory, particularly non-volatile memory (NVM) and disaggregated memory (DM), offers a promising opportunity to achieve fast recovery with maximum performance. However, incorporating these technologies into datacenter architecture presents significant challenges; Their distinct architectural attributes, differing significantly from traditional memory devices, introduce new semantic challenges for
implementing recovery, complicating correctness and programmability.
Can emerging memory enable fast, performant, and correct recovery in the datacenter? This thesis aims to answer this question while addressing the associated challenges.
When architecting datacenters with emerging memory, system architects face four key challenges: (1) how to guarantee correct semantics; (2) how to efficiently enforce correctness with optimal performance; (3) how to validate end-to-end correctness including recovery; and (4) how to preserve programmer productivity (Programmability).
This thesis aims to address these challenges through the following approaches: (a)
defining precise consistency models that formally specify correct end-to-end semantics
in the presence of failures (consistency models also play a crucial role in programmability); (b) developing new low-level mechanisms to efficiently enforce the prescribed models given the capabilities of emerging memory; and (c) creating robust testing frameworks to validate end-to-end correctness and recovery.
We start our exploration with non-volatile memory (NVM), which offers fast persistence capabilities directly accessible through the processor’s load-store (memory) interface. Notably, these capabilities can be leveraged to enable fast recovery for Log-Free Data Structures (LFDs) while maximizing performance. However, due to the complexity of modern cache hierarchies, data hardly persist in any specific order, jeop-
ardizing recovery and correctness. Therefore, recovery needs primitives that explicitly control the order of updates to NVM (known as persistency models). We outline the precise specification of a novel persistency model – Release Persistency (RP) – that provides a consistency guarantee for LFDs on what remains in non-volatile memory upon failure. To efficiently enforce RP, we propose a novel microarchitecture mechanism,
lazy release persistence (LRP). Using standard LFDs benchmarks, we show that LRP achieves fast recovery while incurring minimal overhead on performance.
We continue our discussion with memory disaggregation which decouples memory from traditional monolithic servers, offering a promising pathway for achieving very high availability in replicated in-memory data stores. Achieving such availability hinges on transaction protocols that can efficiently handle recovery in this setting, where
compute and memory are independent. However, there is a challenge: disaggregated memory (DM) fails to work with RPC-style protocols, mandating one-sided transaction protocols. Exacerbating the problem, one-sided transactions expose critical low-level
ordering to architects, posing a threat to correctness. We present a highly available transaction protocol, Pandora, that is specifically designed to achieve fast recovery in disaggregated key-value stores (DKVSes).
Pandora is the first one-sided transactional protocol that ensures correct, non-blocking, and fast recovery in DKVS. Our experimental implementation artifacts demonstrate that Pandora achieves fast recovery and high availability while causing minimal disruption to services.
Finally, we introduce a novel target litmus-testing framework – DART – to validate the end-to-end correctness of transactional protocols with recovery. Using DART’s target testing capabilities, we have found several critical bugs in Pandora, highlighting the need for robust end-to-end testing methods in the design loop to iteratively fix correctness bugs. Crucially, DART is lightweight and black-box, thereby eliminating
any intervention from the programmers
A foundation for synthesising programming language semantics
Programming or scripting languages used in real-world systems are seldom designed
with a formal semantics in mind from the outset. Therefore, the first step for developing well-founded analysis tools for these systems is to reverse-engineer a formal
semantics. This can take months or years of effort.
Could we automate this process, at least partially? Though desirable, automatically reverse-engineering semantics rules from an implementation is very challenging,
as found by Krishnamurthi, Lerner and Elberty. They propose automatically learning
desugaring translation rules, mapping the language whose semantics we seek to a simplified, core version, whose semantics are much easier to write. The present thesis
contains an analysis of their challenge, as well as the first steps towards a solution.
Scaling methods with the size of the language is very difficult due to state space
explosion, so this thesis proposes an incremental approach to learning the translation
rules. I present a formalisation that both clarifies the informal description of the challenge by Krishnamurthi et al, and re-formulates the problem, shifting the focus to the
conditions for incremental learning. The central definition of the new formalisation is
the desugaring extension problem, i.e. extending a set of established translation rules
by synthesising new ones.
In a synthesis algorithm, the choice of search space is important and non-trivial,
as it needs to strike a good balance between expressiveness and efficiency. The rest
of the thesis focuses on defining search spaces for translation rules via typing rules.
Two prerequisites are required for comparing search spaces. The first is a series of
benchmarks, a set of source and target languages equipped with intended translation
rules between them. The second is an enumerative synthesis algorithm for efficiently
enumerating typed programs. I show how algebraic enumeration techniques can be applied to enumerating well-typed translation rules, and discuss the properties expected
from a type system for ensuring that typed programs be efficiently enumerable.
The thesis presents and empirically evaluates two search spaces. A baseline search
space yields the first practical solution to the challenge. The second search space is
based on a natural heuristic for translation rules, limiting the usage of variables so that
they are used exactly once. I present a linear type system designed to efficiently enumerate translation rules, where this heuristic is enforced. Through informal analysis
and empirical comparison to the baseline, I then show that using linear types can speed
up the synthesis of translation rules by an order of magnitude
On the Generation of Realistic and Robust Counterfactual Explanations for Algorithmic Recourse
This recent widespread deployment of machine learning algorithms presents many new challenges. Machine learning algorithms are usually opaque and can be particularly difficult to interpret. When humans are involved, algorithmic and automated decisions can negatively impact people’s lives. Therefore, end users would like to be insured against potential harm. One popular way to achieve this is to provide end users access to algorithmic recourse, which gives end users negatively affected by algorithmic decisions the opportunity to reverse unfavorable decisions, e.g., from a loan denial to a loan acceptance. In this thesis, we design recourse algorithms to meet various end user needs. First, we propose methods for the generation of realistic recourses. We use generative models to suggest recourses likely to occur under the data distribution. To this end, we shift the recourse action from the input space to the generative model’s latent space, allowing to generate counterfactuals that lie in regions with data support. Second, we observe that small changes applied to the recourses prescribed to end users likely invalidate the suggested recourse after being nosily implemented in practice. Motivated by this observation, we design methods for the generation of robust recourses and for assessing the robustness of recourse algorithms to data deletion requests. Third, the lack of a commonly used code-base for counterfactual explanation and algorithmic recourse algorithms and the vast array of evaluation measures in literature make it difficult to compare the per formance of different algorithms. To solve this problem, we provide an open source benchmarking library that streamlines the evaluation process and can be used for benchmarking, rapidly developing new methods, and setting up new
experiments. In summary, our work contributes to a more reliable interaction of end users and machine learned models by covering fundamental aspects of the recourse process and suggests new solutions towards generating realistic and robust counterfactual explanations for algorithmic recourse
Automation for network security configuration: state of the art and research trends
The size and complexity of modern computer networks are progressively increasing, as a consequence of novel architectural paradigms such as the Internet of Things and network virtualization. Consequently, a manual orchestration and configuration of network security functions is no more feasible, in an environment where cyber attacks can dramatically exploit breaches related to any minimum configuration error. A new frontier is then the introduction of automation in network security configuration, i.e., automatically designing the architecture of security services and the configurations of network security functions, such as firewalls, VPN gateways, etc. This opportunity has been enabled by modern computer networks technologies, such as virtualization. In view of these considerations, the motivations for the introduction of automation in network security configuration are first introduced, alongside with the key automation enablers. Then, the current state of the art in this context is surveyed, focusing on both the achieved improvements and the current limitations. Finally, possible future trends in the field are illustrated
Fast and Designated-verifier Friendly zkSNARKs in the BPK Model
After the pioneering results proposed by Bellare et al in ASIACRYPT 2016, there have been lots of efforts to construct zero-knowledge succinct non-interactive arguments of knowledge protocols (zk-SNARKs) that satisfy subversion zero knowledge (S-ZK) and standard soundness from the zk-SNARK in the common reference string (CRS) model. The various constructions could be regarded secure in the bare public key (BPK) model because of the equivalence between S-ZK in the CRS model, and uniform non-black-box zero knowledge in the BPK model has been proved by Abdolmaleki et al. in PKC 2020.
In this study, by leveraging the power of random oracle (RO) model, we proposed the first publicly verifiable non-uniform ZK zk-SNARK scheme in the BPK model maintaining comparable efficiency with its conventional counterpart, which can also be compatible with the well-known transformation proposed by Bitansky et al. in TCC 2013 to obtain an efficient designated-verifier zk-SNARK. We achieve this goal by only adding a constant number of elements into the CRS, and using an unconventional but natural method to transform Groth’s zk-SNARK in EUROCRYPT 2016. In addition, we propose a new speed-up technique that provides a trade-off. Specifically, if a logarithmic number of elements are added into the CRS, according to different circuits, the CRS verification time in our construction could be approximately shorter than that in the conventional counterpart
Classical and quantum algorithms for scaling problems
This thesis is concerned with scaling problems, which have a plethora of connections to different areas of mathematics, physics and computer science. Although many structural aspects of these problems are understood by now, we only know how to solve them efficiently in special cases.We give new algorithms for non-commutative scaling problems with complexity guarantees that match the prior state of the art. To this end, we extend the well-known (self-concordance based) interior-point method (IPM) framework to Riemannian manifolds, motivated by its success in the commutative setting. Moreover, the IPM framework does not obviously suffer from the same obstructions to efficiency as previous methods. It also yields the first high-precision algorithms for other natural geometric problems in non-positive curvature.For the (commutative) problems of matrix scaling and balancing, we show that quantum algorithms can outperform the (already very efficient) state-of-the-art classical algorithms. Their time complexity can be sublinear in the input size; in certain parameter regimes they are also optimal, whereas in others we show no quantum speedup over the classical methods is possible. Along the way, we provide improvements over the long-standing state of the art for searching for all marked elements in a list, and computing the sum of a list of numbers.We identify a new application in the context of tensor networks for quantum many-body physics. We define a computable canonical form for uniform projected entangled pair states (as the solution to a scaling problem), circumventing previously known undecidability results. We also show, by characterizing the invariant polynomials, that the canonical form is determined by evaluating the tensor network contractions on networks of bounded size
Formal proofs applied to system models
National audienceUsually, the description of nuclear equipment by the FMEA (Failure Mode and Effects Analysis) method can be of considerable length (up to 5,000 lines); on the other hand, the number of rules used for the verification of this equipment is small. In addition, upstream, there is the question of trust in the tools that generate these descriptions for complex equipment, that is to say, made up of several thousand objects (requirements, functions, interfaces, behaviors)
New formulas for cup- products and fast computation of Steenrod squares
Operations on the cohomology of spaces are important tools enhancing thedescriptive power of this computable invariant. For cohomology with mod 2coefficients, Steenrod squares are the most significant of these operations.Their effective computation relies on formulas defining a cup- construction,a structure on (co)chains which is important in its own right, havingconnections to lattice field theory, convex geometry and higher category theoryamong others. In this article we present new formulas defining a cup-construction, and use them to introduce a fast algorithm for the computation ofSteenrod squares on the cohomology of finite simplicial complexes. Inforthcoming work we use these formulas to axiomatically characterize thecup- construction they define, showing additionally that all other formulasin the literature define the same cup- construction up to isomorphism.<br
Language integrated relational lenses
Relational databases are ubiquitous. Such monolithic databases accumulate large
amounts of data, yet applications typically only work on small portions of the data
at a time. A subset of the database defined as a computation on the underlying
tables is called a view. Querying views is helpful, but it is also desirable to update
them and have these changes be applied to the underlying database. This view
update problem has been the subject of much previous work before, but support
by database servers is limited and only rarely available.
Lenses are a popular approach to bidirectional transformations, a generalization
of the view update problem in databases to arbitrary data. However, perhaps surprisingly, lenses have seldom actually been used to implement updatable views in
databases. Bohannon, Pierce and Vaughan propose an approach to updatable views called relational lenses. However, to the best of our knowledge this
proposal has not been implemented or evaluated prior to the work reported in
this thesis.
This thesis proposes programming language support for relational lenses. Language integrated relational lenses support expressive and efficient view updates,
without relying on updatable view support from the database server. By integrating relational lenses into the programming language, application development
becomes easier and less error-prone, avoiding the impedance mismatch of having
two programming languages. Integrating relational lenses into the language poses
additional challenges. As defined by Bohannon et al. relational lenses completely
recompute the database, making them inefficient as the database scales. The
other challenge is that some parts of the well-formedness conditions are too general for implementation. Bohannon et al. specify predicates using possibly infinite
abstract sets and define the type checking rules using relational algebra.
Incremental relational lenses equip relational lenses with change-propagating semantics that map small changes to the view into (potentially) small changes
to the source tables. We prove that our incremental semantics are functionally
equivalent to the non-incremental semantics, and our experimental results show
orders of magnitude improvement over the non-incremental approach. This thesis introduces a concrete predicate syntax and shows how the required checks
are performed on these predicates and show that they satisfy the abstract predicate specifications. We discuss trade-offs between static predicates that are fully
known at compile time vs dynamic predicates that are only known during execution and introduce hybrid predicates taking inspiration from both approaches.
This thesis adapts the typing rules for relational lenses from sequential composition to a functional style of sub-expressions. We prove that any well-typed
functional relational lens expression can derive a well-typed sequential lens.
We use these additions to relational lenses as the foundation for two practical implementations: an extension of the Links functional language and a library written
in Haskell. The second implementation demonstrates how type-level computation can be used to implement relational lenses without changes to the compiler.
These two implementations attest to the possibility of turning relational lenses
into a practical language feature
- …