1,346 research outputs found
Execution replay and debugging
As most parallel and distributed programs are internally non-deterministic --
consecutive runs with the same input might result in a different program flow
-- vanilla cyclic debugging techniques as such are useless. In order to use
cyclic debugging tools, we need a tool that records information about an
execution so that it can be replayed for debugging. Because recording
information interferes with the execution, we must limit the amount of
information and keep the processing of the information fast. This paper
contains a survey of existing execution replay techniques and tools.Comment: In M. Ducasse (ed), proceedings of the Fourth International Workshop
on Automated Debugging (AADebug 2000), August 2000, Munich. cs.SE/001003
Out-Of-Place debugging: a debugging architecture to reduce debugging interference
Context. Recent studies show that developers spend most of their programming
time testing, verifying and debugging software. As applications become more and
more complex, developers demand more advanced debugging support to ease the
software development process.
Inquiry. Since the 70's many debugging solutions were introduced. Amongst
them, online debuggers provide a good insight on the conditions that led to a
bug, allowing inspection and interaction with the variables of the program.
However, most of the online debugging solutions introduce \textit{debugging
interference} to the execution of the program, i.e. pauses, latency, and
evaluation of code containing side-effects.
Approach. This paper investigates a novel debugging technique called
\outofplace debugging. The goal is to minimize the debugging interference
characteristic of online debugging while allowing online remote capabilities.
An \outofplace debugger transfers the program execution and application state
from the debugged application to the debugger application, both running in
different processes.
Knowledge. On the one hand, \outofplace debugging allows developers to debug
applications remotely, overcoming the need of physical access to the machine
where the debugged application is running. On the other hand, debugging happens
locally on the remote machine avoiding latency. That makes it suitable to be
deployed on a distributed system and handle the debugging of several processes
running in parallel.
Grounding. We implemented a concrete out-of-place debugger for the Pharo
Smalltalk programming language. We show that our approach is practical by
performing several benchmarks, comparing our approach with a classic remote
online debugger. We show that our prototype debugger outperforms by a 1000
times a traditional remote debugger in several scenarios. Moreover, we show
that the presence of our debugger does not impact the overall performance of an
application.
Importance. This work combines remote debugging with the debugging experience
of a local online debugger. Out-of-place debugging is the first online
debugging technique that can minimize debugging interference while debugging a
remote application. Yet, it still keeps the benefits of online debugging ( e.g.
step-by-step execution). This makes the technique suitable for modern
applications which are increasingly parallel, distributed and reactive to
streams of data from various sources like sensors, UI, network, etc
A Runtime Software Visualization Environment
As software systems become more complex, so does the task of understanding them. To modify even a simple component of a complex system, at least a rudimentary understanding of the structure and behavior of the whole system is necessary. Although currently available development tools can provide a static representation of a complex system, these utilities are severely limited and prohibitively expensive. As a result, most programmers working on large software systems today resort to classic debuggers and time-consuming plain-text searches through hundreds or thousands of source files. This proposal describes a software development environment that uses static representations of hierarchically structured source code side by side with dynamic visualizations of software systems as they run. This environment provides an intuitive, visual means of easily comprehending complex systems, and has been provided as an open-source development tool for both professionals and students of software engineering
COST Action IC 1402 ArVI: Runtime Verification Beyond Monitoring -- Activity Report of Working Group 1
This report presents the activities of the first working group of the COST
Action ArVI, Runtime Verification beyond Monitoring. The report aims to provide
an overview of some of the major core aspects involved in Runtime Verification.
Runtime Verification is the field of research dedicated to the analysis of
system executions. It is often seen as a discipline that studies how a system
run satisfies or violates correctness properties. The report exposes a taxonomy
of Runtime Verification (RV) presenting the terminology involved with the main
concepts of the field. The report also develops the concept of instrumentation,
the various ways to instrument systems, and the fundamental role of
instrumentation in designing an RV framework. We also discuss how RV interplays
with other verification techniques such as model-checking, deductive
verification, model learning, testing, and runtime assertion checking. Finally,
we propose challenges in monitoring quantitative and statistical data beyond
detecting property violation
Recommended from our members
Building Reliable Software for Persistent Memory
Persistent memory (PMEM) technologies preserve data across power cycles and provide performance comparable to DRAM. In emerging computer systems, PMEM will operate on the main memory bus, becoming byte-addressable and cache-coherent. One key feature enabled by persistent memory is to allow software directly accessing durable data using the CPU’s load/store instructions, even from the user-space.However, building reliable software for persistent memory faces new challenges from two aspects: crash consistency and fault tolerance. Maintaining crash consistency requires the ability to recover data integrity in the event of system crashes. Using load/store instructions to access durable data introduces a new programming paradigm, that is prone to new types of programming errors. Fault tolerance involves detecting and recovering from persistent memory errors, including memory media errors and scribbles from software bugs. With direct access, file systems and user-space applications have to explicitly manage these errors, instead of relying on convenient functions from lower I/O stacks.We identify unique challenges in improving reliability for PMEM-based software and propose solutions. The thesis first introduces NOVA-Fortis, a fault-tolerant PMEM file system incorporating replication, checksums, and parity for protecting the file system’s metadata and the user’s file data. NOVA-Fortis is both fast and resilient in the face of corruption due to media errors and software bugs.NOVA-Fortis only protects file data via the read() and write() system calls. When an application memory-maps a PMEM file, NOVA-Fortis has to disable file data protection because mmap() leaves the file system unaware of updates made to the file. For protecting memory-mapped PMEM data, we present Pangolin, a fault-tolerant persistent object library to protect an application’s objects from persistent memory errors.Writing programs to ensure crash consistency in PMEM remains challenging. Recovery bugs arise as a new type of programming error, preventing a post-crash PMEM file from recovering to a consistent state. Thus, we design two debugging tools for persistent memory programming: PmemConjurer and PmemSanitizer. PmemConjurer is a static analyzer using symbolic execution to find recovery bugs without running a compiled program. PmemSanitizer contains compiler instrumentation and run-time recovery bug analysis, compensating PmemConjurer with multi-threading support and store reordering tests
A debugging engine for parallel and distributed programs
Dissertação apresentada para a obtenção
do Grau de Doutor em Informática pela
Universidade Nova de Lisboa, Faculdade
de Ciências e Tecnologia.In the last decade a considerable amount of research work has focused on distributed
debugging, one of the crucial fields in the parallel software development cycle. The
productivity of the software development process strongly depends on the adequate
definition of what debugging tools should be provided, and what debugging methodologies
and functionalities should these tools support.
The work described in this dissertation was initiated in 1995, in the context of two
research projects, the SEPP (Software Engineering for Parallel Processing) and HPCTI
(High-Performance Computing Tools for Industry), both sponsored by the European
Union in the Copernicus programme, which aimed at the design and implementation
of an integrated parallel software development environment. In the context of these
projects, two independent toolsets have been developed, the GRADE and EDPEPPS
parallel software development environments.
Our contribution to these projects was in the debugging support. We have designed
a debugging engine and developed a prototype, which was integrated the both
toolsets (it was the only tool developed in the context of the SEPP and HPCTI projects
which achieved such a result). Even after the closing of those research projects, further
research work on distributed debugger has been carried on, which conducted to the
re-design and re-implementation of the debugging engine.
This dissertation describes the debugging engine according to its most up-to-date
design and implementation stages. It also reposts some of the experimentalworkmade
with both the initial and the current implementations, and how it contributed to validate
the design and implementations of the debugging engine
Recommended from our members
Effective Performance Analysis and Debugging
Performance is once again a first-class concern. Developers can no longer wait for the next generation of processors to automatically optimize their software. Unfortunately, existing techniques for performance analysis and debugging cannot cope with complex modern hardware, concurrent software, or latency-sensitive software services.
While processor speeds have remained constant, increasing transistor counts have allowed architects to increase processor complexity. This complexity often improves performance, but the benefits can be brittle; small changes to a program’s code, inputs, or execution environment can dramatically change performance, resulting in unpredictable performance in deployed software and complicating performance evaluation and debugging. Developers seeking to improve performance must resort to manual performance tuning for large performance gains. Software profilers are meant to guide developers to important code, but conventional profilers do not produce actionable information for concurrent applications. These profilers report where a program spends its time, not where optimizations will yield performance improvements. Furthermore, latency is a critical measure of performance for software services and interactive applications, but conventional profilers measure only throughput. Many performance issues appear only when a system is under high load, but generating this load in development is often impossible. Developers need to identify and mitigate scalability issues before deploying software, but existing tools offer developers little or no assistance.
In this dissertation, I introduce an empirically-driven approach to performance analysis and debugging. I present three systems for performance analysis and debugging. Stabilizer mitigates the performance variability that is inherent in modern processors, enabling both predictable performance in deployment and statistically sound performance evaluation. Coz conducts performance experiments using virtual speedups to create the effect of an optimization in a running application. This approach accurately predicts the effect of hypothetical optimizations, guiding developers to code where optimizations will have the largest effect. Amp allows developers to evaluate system scalability using load amplification to create the effect of high load in a testing environment. In combination, Amp and Coz allow developers to pinpoint code where manual optimizations will improve the scalability of their software
A Domain Specific Language Based Approach for Generating Deadlock-Free Parallel Load Scheduling Protocols for Distributed Systems
In this dissertation, the concept of using domain specific language to develop errorree parallel asynchronous load scheduling protocols for distributed systems is studied. The motivation of this study is rooted in addressing the high cost of verifying parallel asynchronous load scheduling protocols. Asynchronous parallel applications are prone to subtle bugs such as deadlocks and race conditions due to the possibility of non-determinism. Due to this non-deterministic behavior, traditional testing methods are less effective at finding software faults. One approach that can eliminate these software bugs is to employ model checking techniques that can verify that non-determinism will not cause software faults in parallel programs. Unfortunately, model checking requires the development of a verification model of a program in a separate verification language which can be an error-prone procedure and may not properly represent the semantics of the original system. The model checking approach can provide true positive result if the semantics of an implementation code and a verification model is represented under a single framework such that the verification model closely represents the implementation and the automation of a verification process is natural. In this dissertation, a domain specific language based verification framework is developed to design parallel load scheduling protocols and automatically verify their behavioral properties through model checking. A specification language, LBDSL, is introduced that facilitates the development of parallel load scheduling protocols. The LBDSL verification framework uses model checking techniques to verify the asynchronous behavior of the protocol. It allows the same protocol specification to be used for verification and the code generation. The support to automatic verification during protocol development reduces the verification cost post development. The applicability of LBDSL verification framework is illustrated by performing case study on three different types of load scheduling protocols. The study shows that the LBDSL based verification approach removes the need of debugging for deadlocks and race bugs which has potential to significantly lower software development costs
SOFTVIZ... A Step Forward
Complex software systems are difficult to understand and very hard to debug. Programmers trying to understand or debug these systems must read through source code which may span over thousands of files. Software Visualization tries to ease this burden by using graphics and animation to convey important information about the program to the user, which may be used either for understanding the behavior of the program or for detecting any defects within the code. SoftViz is one such software visualization system, developed by Ben Kurtz under the guidance of Prof. George T. Heineman at WPI. We carry forward the work initiated with SoftViz. Our preliminary study showed various avenues for making the system more effective and user-friendly. Specifically I completed the unfinished work, made optimizations, implemented new functionality and added new visualization plug-ins, all aimed at making the system a more versatile and user-friendly debugging framework. We built a solid core functionality that would be able to support various functionalities and created new plug-ins that would make understanding and bug-detection easier. Further we integrated SoftViz with the Eclipse development environment, making the system easily accessible and potentially widely used. We created an error classification framework relating the common error classes and the visualizations that could be used to detect them. We believe this will be helpful in both selecting the right visualization options as well as constructing new plug-ins
Detecting and correcting errors in parallel object oriented systems
Our research concerns the development of an operational formalism for the in-source specification of parallel, object oriented systems. These specifications are used to enunciate the behavioural semantics of objects, as a means of enhancing their reliability. A review of object oriented languages concludes that the advance in language sophistication heralded by the object oriented paradigm has, so far, failed to produce a commensurate increase in software reliability. The lack of support in modern object oriented languages for the notion of 'valid object behaviour', as distinct from state and operations, undermines the potential power of the abstraction. Furthermore, it weakens the ability of such languages to detect behavioural problems, manifest at run-time. As a result, in-language facilities for the signalling and handling of undesirable program behaviours or states (for example, assertions) are still in their infancy. This is especially true of parallel systems, where the scope for subtle error is greater. The first goal of this work was to construct an operational model of a general purpose, parallel, object oriented system in order to ascertain the fundamental set of event classes that constitute its observable behaviour. Our model is built on the CSP process calculus and uses a subset of the Z notation to express some aspects of state. This alphabet was then used to construct a formalism designed to augment each object type description with the operational specification of an object's behaviour: Event Pattern Specifications (EPS). EPSs are a labeled list of acceptable object behaviours which form part of the definition of every type. The thesis includes a description of the design and implementation of EPSs as part of an exception handling mechanism for the parallel, object oriented language Solve. Using this implementation, we have established that the run-time checking of EPS specifications is feasible, albeit it with considerable overhead. Issues arising from this implementation are discussed and we describe the visualization of EPSs and their use in semantic browsing
- …