1,127 research outputs found

    Execution replay and debugging

    Full text link
    As most parallel and distributed programs are internally non-deterministic -- consecutive runs with the same input might result in a different program flow -- vanilla cyclic debugging techniques as such are useless. In order to use cyclic debugging tools, we need a tool that records information about an execution so that it can be replayed for debugging. Because recording information interferes with the execution, we must limit the amount of information and keep the processing of the information fast. This paper contains a survey of existing execution replay techniques and tools.Comment: In M. Ducasse (ed), proceedings of the Fourth International Workshop on Automated Debugging (AADebug 2000), August 2000, Munich. cs.SE/001003

    A Study of Concurrency Bugs and Advanced Development Support for Actor-based Programs

    Full text link
    The actor model is an attractive foundation for developing concurrent applications because actors are isolated concurrent entities that communicate through asynchronous messages and do not share state. Thereby, they avoid concurrency bugs such as data races, but are not immune to concurrency bugs in general. This study taxonomizes concurrency bugs in actor-based programs reported in literature. Furthermore, it analyzes the bugs to identify the patterns causing them as well as their observable behavior. Based on this taxonomy, we further analyze the literature and find that current approaches to static analysis and testing focus on communication deadlocks and message protocol violations. However, they do not provide solutions to identify livelocks and behavioral deadlocks. The insights obtained in this study can be used to improve debugging support for actor-based programs with new debugging techniques to identify the root cause of complex concurrency bugs.Comment: - Submitted for review - Removed section 6 "Research Roadmap for Debuggers", its content was summarized in the Future Work section - Added references for section 1, section 3, section 4.3 and section 5.1 - Updated citation

    Deterministic Consistency: A Programming Model for Shared Memory Parallelism

    Full text link
    The difficulty of developing reliable parallel software is generating interest in deterministic environments, where a given program and input can yield only one possible result. Languages or type systems can enforce determinism in new code, and runtime systems can impose synthetic schedules on legacy parallel code. To parallelize existing serial code, however, we would like a programming model that is naturally deterministic without language restrictions or artificial scheduling. We propose "deterministic consistency", a parallel programming model as easy to understand as the "parallel assignment" construct in sequential languages such as Perl and JavaScript, where concurrent threads always read their inputs before writing shared outputs. DC supports common data- and task-parallel synchronization abstractions such as fork/join and barriers, as well as non-hierarchical structures such as producer/consumer pipelines and futures. A preliminary prototype suggests that software-only implementations of DC can run applications written for popular parallel environments such as OpenMP with low (<10%) overhead for some applications.Comment: 7 pages, 3 figure

    Static Trace-Based Deadlock Analysis for Synchronous Mini-Go

    Full text link
    We consider the problem of static deadlock detection for programs in the Go programming language which make use of synchronous channel communications. In our analysis, regular expressions extended with a fork operator capture the communication behavior of a program. Starting from a simple criterion that characterizes traces of deadlock-free programs, we develop automata-based methods to check for deadlock-freedom. The approach is implemented and evaluated with a series of examples

    OPR

    Get PDF
    The ability to reproduce a parallel execution is desirable for debugging and program reliability purposes. In debugging (13), the programmer needs to manually step back in time, while for resilience (6) this is automatically performed by the the application upon failure. To be useful, replay has to faithfully reproduce the original execution. For parallel programs the main challenge is inferring and maintaining the order of conflicting operations (data races). Deterministic record and replay (R&amp;R) techniques have been developed for multithreaded shared memory programs (5), as well as distributed memory programs (14). Our main interest is techniques for large scale scientific (3; 4) programming models

    Doctor of Philosophy

    Get PDF
    dissertationAlmost all high performance computing applications are written in MPI, which will continue to be the case for at least the next several years. Given the huge and growing importance of MPI, and the size and sophistication of MPI codes, scalable and incisive MPI debugging tools are essential. Existing MPI debugging tools have, despite their strengths, many glaring de ficiencies, especially when it comes to debugging under the presence of nondeterminism related bugs, which are bugs that do not always show up during testing. These bugs usually become manifest when the systems are ported to di fferent platforms for production runs. This dissertation focuses on the problem of developing scalable dynamic verifi cation tools for MPI programs that can provide a coverage guarantee over the space of MPI nondeterminism. That is, the tools should be able to detect diff erent outcomes of nondeterministic events in an MPI program and enforce all those di fferent outcomes through repeated executions of the program with the same test harness. We propose to achieve the coverage guarantee by introducing efficient distributed causality tracking protocols that are based on the matches-before order. The matches-before order is introduced to address the shortcomings of the Lamport happens-before order [40], which is not sufficient to capture causality for MPI program executions due to the complexity of the MPI semantics. The two protocols we propose are the Lazy Lamport Clocks Protocol (LLCP) and the Lazy Vector Clocks Protocol (LVCP). LLCP provides good scalability with a small possibility of missing potential outcomes of nondeterministic events while LVCP provides full coverage guarantee with a scalability tradeoff . In practice, we show through our experiments that LLCP provides the same coverage as LVCP. This thesis makes the following contributions: •The MPI matches-before order that captures the causality between MPI events in an MPI execution. • Two distributed causality tracking protocols for MPI programs that rely on the matches-before order. • A Distributed Analyzer for MPI programs (DAMPI), which implements the two aforementioned protocols to provide scalable and modular dynamic verifi cation for MPI programs. • Scalability enhancement through algorithmic improvements for ISP, a dynamic verifi er for MPI programs
    • …
    corecore