1,006 research outputs found

    Efficient and Deterministic Record & Replay for Actor Languages

    Get PDF
    With the ubiquity of parallel commodity hardware, developers turn to high-level concurrency models such as the actor model to lower the complexity of concurrent software. However, debugging concurrent software is hard, especially for concurrency models with a limited set of supporting tools. Such tools often deal only with the underlying threads and locks, which obscures the view on e.g. actors and messages and thereby introduces additional complexity. To improve on this situation, we present a low-overhead record & replay approach for actor languages. It allows one to debug concurrency issues deterministically based on a previously recorded trace. Our evaluation shows that the average run-time overhead for tracing on benchmarks from the Savina suite is 10% (min. 0%, max. 20%). For Acme-Air, a modern web application, we see a maximum increase of 1% in latency for HTTP requests and about 1.4 MB/s of trace data. These results are a first step towards deterministic replay debugging of actor systems in production

    Simplifying Contract-Violating Traces

    Full text link
    Contract conformance is hard to determine statically, prior to the deployment of large pieces of software. A scalable alternative is to monitor for contract violations post-deployment: once a violation is detected, the trace characterising the offending execution is analysed to pinpoint the source of the offence. A major drawback with this technique is that, often, contract violations take time to surface, resulting in long traces that are hard to analyse. This paper proposes a methodology together with an accompanying tool for simplifying traces and assisting contract-violation debugging.Comment: In Proceedings FLACOS 2012, arXiv:1209.169

    Optimizing Abstract Abstract Machines

    Full text link
    The technique of abstracting abstract machines (AAM) provides a systematic approach for deriving computable approximations of evaluators that are easily proved sound. This article contributes a complementary step-by-step process for subsequently going from a naive analyzer derived under the AAM approach, to an efficient and correct implementation. The end result of the process is a two to three order-of-magnitude improvement over the systematically derived analyzer, making it competitive with hand-optimized implementations that compute fundamentally less precise results.Comment: Proceedings of the International Conference on Functional Programming 2013 (ICFP 2013). Boston, Massachusetts. September, 201

    Causal-Consistent Replay Debugging for Message Passing Programs

    Get PDF
    Debugging of concurrent systems is a tedious and error-prone activity. A main issue is that there is no guarantee that a bug that appears in the original computation is replayed inside the debugger. This problem is usually tackled by so-called replay debugging, which allows the user to record a program execution and replay it inside the debugger. In this paper, we present a novel technique for replay debugging that we call controlled causal-consistent replay. Controlled causal-consistent replay allows the user to record a program execution and, in contrast to traditional replay debuggers, to reproduce a visible misbehavior inside the debugger including all and only its causes. In this way, the user is not distracted by the actions of other, unrelated processes

    Capturing High-level Nondeterminism in Concurrent Programs for Practical Concurrency Model Agnostic Record and Replay

    Get PDF
    With concurrency being integral to most software systems, developers combine high-level concurrency models in the same application to tackle each problem with appropriate abstractions. While languages and libraries offer a wide range of concurrency models, debugging support for applications that combine them has not yet gained much attention. Record & replay aids debugging by deterministically reproducing recorded bugs, but is typically designed for a single concurrency model only. This paper proposes a practical concurrency-model-agnostic record & replay approach for multi-paradigm concurrent programs, i.e. applications that combine concurrency models. Our approach traces high-level non- deterministic events by using a uniform model-agnostic trace format and infrastructure. This enables ordering- based record & replay support for a wide range of concurrency models, and thereby enables debugging of applications that combine them. In addition, it allows language implementors to add new concurrency mod- els and reuse the model-agnostic record & replay support. We argue that a concurrency-model-agnostic record & replay is practical and enables advanced debugging support for a wide range of concurrency models. The evaluation shows that our approach is expressive and flexible enough to support record & replay of applications using threads & locks, communicating event loops, communicating sequential processes, software transactional memory and combinations of those concurrency models. For the actor model, we reach recording performance competitive with an optimized special-purpose record & replay solution. The average recording overhead on the Savina actor benchmark suite is 10% (min. 0%, max. 23%). The performance for other concurrency models and combinations thereof is at a similar level. We believe our concurrency-model-agnostic approach helps developers of applications that mix and match concurrency models. We hope that this substrate inspires new tools and languages making building and maintaining of multi-paradigm concurrent applications simpler and safer

    Safety-Critical Learning of Robot Control with Temporal Logic Specifications

    Full text link
    Reinforcement learning (RL) is a promising approach. However, success is limited to real-world applications, because ensuring safe exploration and facilitating adequate exploitation is a challenge for controlling robotic systems with unknown models and measurement uncertainties. The learning problem becomes even more difficult for complex tasks over continuous state-action. In this paper, we propose a learning-based robotic control framework consisting of several aspects: (1) we leverage Linear Temporal Logic (LTL) to express complex tasks over infinite horizons that are translated to a novel automaton structure; (2) we detail an innovative reward scheme for LTL satisfaction with a probabilistic guarantee. Then, by applying a reward shaping technique, we develop a modular policy-gradient architecture exploiting the benefits of the automaton structure to decompose overall tasks and enhance the performance of learned controllers; (3) by incorporating Gaussian Processes (GPs) to estimate the uncertain dynamic systems, we synthesize a model-based safe exploration during the learning process using Exponential Control Barrier Functions (ECBFs) that generalize systems with high-order relative degrees; (4) to further improve the efficiency of exploration, we utilize the properties of LTL automata and ECBFs to propose a safe guiding process. Finally, we demonstrate the effectiveness of the framework via several robotic environments. We show an ECBF-based modular deep RL algorithm that achieves near-perfect success rates and safety guarding with high probability confidence during training.Comment: Under Review. arXiv admin note: text overlap with arXiv:2102.1285

    Asynchronous Snapshots of Actor Systems for Latency-Sensitive Applications

    Get PDF
    The actor model is popular for many types of server applications. Efficient snapshotting of applications is crucial in the deployment of pre-initialized applications or moving running applications to different machines, e.g for debugging purposes. A key issue is that snapshotting blocks all other operations. In modern latency-sensitive applications, stopping the application to persist its state needs to be avoided, because users may not tolerate the increased request latency. In order to minimize the impact of snapshotting on request latency, our approach persists the application’s state asynchronously by capturing partial heaps, completing snapshots step by step. Additionally, our solution is transparent and supports arbitrary object graphs. We prototyped our snapshotting approach on top of the Truffle/Graal platform and evaluated it with the Savina benchmarks and the Acme Air microservice application. When performing a snapshot every thousand Acme Air requests, the number of slow requests ( 0.007% of all requests) with latency above 100ms increases by 5.43%. Our Savina microbenchmark results detail how different utilization patterns impact snapshotting cost. To the best of our knowledge, this is the first system that enables asynchronous snapshotting of actor applications, i.e. without stop-the-world synchronization, and thereby minimizes the impact on latency. We thus believe it enables new deployment and debugging options for actor systems
    • …
    corecore