137 research outputs found

    Confuzz—a concurrency fuzzer

    Get PDF

    Exposing concurrency failures: a comprehensive survey of the state of the art and a novel approach to reproduce field failures

    Get PDF
    With the rapid advance of multi-core and distributed architectures, concurrent systems are becoming more and more popular. Concurrent systems are extremely hard to develop and validate, as their overall behavior depends on the non-deterministic interleaving of the execution flows that comprise the system. Wrong and unexpected interleavings may lead to concurrency faults that are extremely hard to avoid, detect, and fix due to their non-deterministic nature. This thesis addresses the problem of exposing concurrency failures. Exposing concurrency failures is a crucial activity to locate and fix the related fault and amounts to determine both a test case and an interleaving that trigger the failure. Given the high cost of manually identifying a failure-inducing test case and interleaving among the infinite number of inputs and interleavings of the system, the problem of automatically exposing concurrency failures has been studied by researchers since the late seventies and is still a hot research topic. This thesis advances the research in exposing concurrency failures by proposing two main contributions. The first contribution is a comprehensive survey and taxonomy of the state-of-the-art techniques for exposing concurrency failures. The taxonomy and survey provide a framework that captures the key features of the existing techniques, identify a set of classification criteria to review and compare them, and highlight their strengths and weaknesses, leading to a thorough assessment of the field and paving the road for future progresses. The second contribution of this thesis is a technique to automatically expose and reproduce concurrency field failure. One of the main findings of our survey is that automatically reproducing concurrency field failures is still an open problem, as the few techniques that have been proposed rely on information that may be hard to collect, and identify failure-inducing interleavings but do not synthesize failure-inducing test cases. We propose a technique that advances over state- of-the-art approaches by relying on information that is easily obtainable and by automatically identifying both a failure- inducing test case and interleaving. We empirically demonstrate the effectiveness of our approach on a benchmark of real concurrency failures taken from different popular code bases

    Efficiency Improvements in the Quality Assurance Process for Data Races

    Get PDF
    As the usage of concurrency in software has gained importance in the last years, and is still rising, new types of defects increasingly appeared in software. One of the most prominent and critical types of such new defect types are data races. Although research resulted in an increased effectiveness of dynamic quality assurance regarding data races, the efficiency in the quality assurance process still is a factor preventing widespread practical application. First, dynamic quality assurance techniques used for the detection of data races are inefficient. Too much effort is needed for conducting dynamic quality assurance. Second, dynamic quality assurance techniques used for the analysis of reported data races are inefficient. Too much effort is needed for analyzing reported data races and identifying issues in the source code. The goal of this thesis is to enable efficiency improvements in the process of quality assurance for data races by: (1) analyzing the representation of the dynamic behavior of a system under test. The results are used to focus instrumentation of this system, resulting in a lower runtime overhead during test execution compared to a full instrumentation of this system. (2) Analyzing characteristics and preprocessing of reported data races. The results of the preprocessing are then provided to developers and quality assurance personnel, enabling an analysis and debugging process, which is more efficient than traditional analysis of data race reports. Besides dynamic data race detection, which is complemented by the solution, all steps in the process of dynamic quality assurance for data races are discussed in this thesis. The solution for analyzing UML Activities for nodes possibly executing in parallel to other nodes or themselves is based on a formal foundation using graph theory. A major problem that has been solved in this thesis was the handling of cycles within UML Activities. This thesis provides a dynamic limit for the number of cycle traversals, based on the elements of each UML Activity to be analyzed and their semantics. Formal proofs are provided with regard to the creation of directed acyclic graphs and with regard to their analysis concerning the identification of elements that may be executed in parallel to other elements. Based on an examination of the characteristics of data races and data race reports, the results of dynamic data race detection are preprocessed and the outcome of this preprocessing is presented to users for further analysis. This thesis further provides an exemplary application of the solution idea, of the results of analyzing UML Activities, and an exemplary examination of the efficiency improvement of the dynamic data race detection, which showed a reduction in the runtime overhead of 44% when using the focused instrumentation compared to full instrumentation. Finally, a controlled experiment has been set up and conducted to examine the effects of the preprocessing of reported data races on the efficiency of analyzing data race reports. The results show that the solution presented in this thesis enables efficiency improvements in the analysis of data race reports between 190% and 660% compared to using traditional approaches. Finally, opportunities for future work are shown, which may enable a broader usage of the results of this thesis and further improvements in the efficiency of quality assurance for data races.Da die Verwendung von Concurrency in Software in den letzten Jahren an Bedeutung gewonnen hat, und immer noch gewinnt, sind zunehmend neue Arten von Fehlern in Software aufgetaucht. Eine der prominentesten und kritischsten Arten solcher neuer Fehlertypen sind data races. Auch wenn die Forschung zu einer steigenden EffektivitĂ€t von Verfahren der dynamischen QualitĂ€tssicherung gefĂŒhrt hat, so ist die Effizienz im Prozess der QualitĂ€tssicherung noch immer ein Faktor, der eine weitverbreitete praktische Anwendung verhindert. Zum einen wird zu viel Aufwand benötigt, um dynamische QualitĂ€tssicherung durchzufĂŒhren. Zum anderen sind die Verfahren zur Analyse gemeldeter data races ineffizient; es wird zu viel Aufwand benötigt, um gemeldete data races zu analysieren und Probleme im Quellcode zu identifizieren. Das Ziel dieser Dissertation ist es, Effizienzsteigerungen im QualitĂ€tssicherungsprozess fĂŒr data races zu ermöglichen, durch: (1) Analyse der ReprĂ€sentation des dynamischen Verhaltens des zu testenden Systems. Mit den Ergebnissen wird die Instrumentierung dieses Systems fokussiert, so dass ein im Vergleich zur vollen Instrumentierung des Systems geringerer Mehraufwand an Laufzeit benötigt wird. (2) Analyse der Charakteristiken von und Vorverarbeitung der gemeldeten data races. Die Ergebnisse der Vorverarbeitung werden Mitarbeitenden in der Entwicklung und QualitĂ€tssicherung prĂ€sentiert, so dass ein Analyse- und Fehlerbehebungsprozess ermöglicht wird, welcher effizienter als traditionelle Analysen gemeldeter data races ist. Mit Ausnahme der dynamischen data race Erkennung, welche durch die Lösung komplementiert wird, werden alle Schritte im Prozess der dynamischen QualitĂ€tssicherung fĂŒr data races in dieser Dissertation behandelt. Die Lösung zur Analyse von UML AktivitĂ€ten auf Knoten, die möglicherweise parallel zu sich selbst oder anderen Knoten ausgefĂŒhrt werden, basiert auf einer formalen Grundlage aus dem Bereich der Graphentheorie. Eines der Hauptprobleme, welches gelöst wurde, war die Verarbeitung von Zyklen innerhalb der UML AktivitĂ€ten. Diese Dissertation fĂŒhrt ein dynamisches Limit fĂŒr die Anzahl an ZyklusdurchlĂ€ufen ein, welches die Elemente jeder zu analysierenden UML AktivitĂ€t sowie deren Semantiken berĂŒcksichtigt. Ebenso werden formale Beweise prĂ€sentiert in Bezug auf die Erstellung gerichteter azyklischer Graphen, sowie deren Analyse zur Identifizierung von Elementen, die parallel zu anderen Elementen ausgefĂŒhrt werden können. Auf Basis einer Untersuchung von Charakteristiken von data races sowie Meldungen von data races werden die Ergebnisse der dynamischen Erkennung von data races vorverarbeitet, und das Ergebnis der Vorverarbeitung gemeldeter data races wird Benutzern zur weiteren Analyse prĂ€sentiert. Diese Dissertation umfasst weiterhin eine exemplarische Anwendung der Lösungsidee und der Analyse von UML AktivitĂ€ten, sowie eine exemplarische Untersuchung der Effizienzsteigerung der dynamischen Erkennung von data races. Letztere zeigte eine Reduktion des Mehraufwands an Laufzeit von 44% bei fokussierter Instrumentierung im Vergleich zu voller Instrumentierung auf. Abschließend wurde ein kontrolliertes Experiment aufgesetzt und durchgefĂŒhrt, um die Effekte der Vorverarbeitung gemeldeter data races auf die Effizienz der Analyse dieser gemeldeten data races zu untersuchen. Die Ergebnisse zeigen, dass die in dieser Dissertation vorgestellte Lösung verglichen mit traditionellen AnsĂ€tzen Effizienzsteigerungen in der Analyse gemeldeter data races von 190% bis zu 660% ermöglicht. Abschließend werden Möglichkeiten fĂŒr zukĂŒnftige Arbeiten vorgestellt, welche eine breitere Anwendung der Ergebnisse dieser Dissertation ebenso wie weitere Effizienzsteigerungen im QualitĂ€tssicherungsprozess fĂŒr data races ermöglichen können

    Finding and Tolerating Concurrency Bugs.

    Full text link
    Shared-memory multi-threaded programming is inherently more difficult than single-threaded programming. The main source of complexity is that, the threads of an application can interleave in so many different ways. To ensure correctness, a programmer has to test all possible thread interleavings, which, however, is impractical. Many rare thread interleavings remain untested in production systems, and they are the major cause for a majority of concurrency bugs. Given that untested interleavings are the major cause of a majority of the concurrency bugs, this dissertation explores two possible ways to tackle concurrency bugs in this dissertation. One is to expose untested interleavings during testing to find concurrency bugs. The other is to avoid untested interleavings during production runs to tolerate concurrency bugs. The key is an efficient and effective way to encode and remember tested interleavings. This dissertation first discusses two hypotheses about concurrency bugs: the small scope hypothesis and the value independent hypothesis. Based on these two hypotheses, this dissertation defines a set of interleaving patterns, called interleaving idioms, which are used to encode tested interleavings. The empirical analysis shows that the idiom based interleaving encoding scheme is able to represent most of the concurrency bugs that are used in the study. Then, this dissertation discusses an open source testing tool called Maple. It memoizes tested interleavings and actively seeks to expose untested interleavings. The results show that Maple is able to expose concurrency bugs and expose interleavings faster than other conventional testing techniques. Finally, this dissertation discusses two parallel runtime system designs which seek to avoid untested interleavings during production runs to tolerate concurrency bugs. Avoiding untested interleavings significantly improve correctness because most of the concurrency bugs are caused by untested interleavings. Also, the performance overhead for disallowing untested interleavings is low as commonly occuring interleavings should have been tested in a well-tested program.PHDComputer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/99765/1/jieyu_1.pd

    ćŠčçŽ‡è‰Żă„ăƒ†ă‚čăƒˆă‚±ăƒŒă‚čç”Ÿæˆă«ă‚ˆă‚‹äžŠèĄŒć‡Šç†ăƒ—ăƒ­ă‚°ăƒ©ăƒ ăźăƒ‡ăƒăƒƒă‚°ăšăƒ†ă‚čト

    Get PDF
    Debugging multi-threaded concurrent programs is more difficult than sequential programs because errors are not always reproducible. Re-executing or instrumenting a concurrent program for tracing might change the execution timing and might cause the concurrent program to take a different execution path. In other words, the exact timing that caused the error is unknown. In order to reproduce the error, one needs to execute the concurrent program with the same input values many times as test cases by changing interleavings, but it is not always feasible to test them all. This dissertation proposes a debugging/testing system that generates all possible executions as test cases based on the limited information obtained from an execution trace, and then detects potential race conditions caused by different schedules and interrupt timings on a concurrent multi-threaded program. There are a number of studies about test cases reduction using partial order reduction, but there are still redundancies for the purpose of checking race conditions. The objective is to efficiently reproduce concurrent errors, specifically race conditions, by proposing three methods. The first is to reduce the numbers of interleavings to be tested. This is achieved by reducing redundant test cases and eliminating infeasible ones. The originality of the proposed method is to exploit the nature of branch coverage and utilize data flows from the trace information to identify only those interleavings that affect branch outcomes, whereas existing methods try to identify all the interleavings which may affect shared variables. Since the execution paths with the same branch outcomes would have equivalent sequences of lock/unlock and read/write operations to shared variables, they can be grouped together in the same “race-equivalent” group. In order to reduce the task for reproducing race conditions, it is sufficient to check only one member of the group. In this way, the proposed method can significantly reduces the number of interleavings for testing while still capable of detecting the same race conditions. Furthermore, the proposed method extends the existing model of execution trace to identify and avoid generating infeasible interleavings due to dependency caused by lock/unlock and wait/notify mechanisms. Experimental results suggest that redundant interleavings can be identified and removed which leads to a significant reduction of test cases. We evaluated the proposed method against several concurrent Java programs. The experimental results for an open source program Apache Commons Pool show the number of test cases is reduced from 23, which is based on the existing Thread-Pair-Interleaving method (TPAIR), to only 2 by the proposed method. Moreover, for concurrent programs that contain infinite loops, the proposed method generates only a finite and very few numbers of test cases, while many existing methods generate an infinite number of test cases. The second is to reduce the memory space required for generating test cases. Redundant test cases were still generated by the existing reachability testing method even though there was no need to execute them. Here, we propose a new method by analyzing data dependency to generate only those test cases that might affect sequences of lock/unlock and read/write operations to shared variables. The experimental results for the Apache Commons Pool show that the size of the graph for creating the test cases is reduced from 990 nodes, as based on the reachability testing method used in our previous work, to only 4 nodes by our new method. The third improvement is to reduce the effort involved in checking race conditions by utilizing previous test results. Existing work requires checking race conditions in the whole execution trace for every new test case. The proposed method can identify only those parts of the execution trace in which the sequence of lock/unlock and read/write operations to shared variables might be affected by a new test case, thus necessitating that race conditions be rechecked only for those affected parts. From the new improvements introduced above, the proposed methods accomplish to significantly reduce the efforts for exhaustively checking all possible interleavings. The proposed methods provide programmers the information regarding whether there exist program errors caused by interleavings, the interleaving (path) when the errors occurred, and accesses to shared variables with inconsistent locking.é›»æ°—é€šäżĄć€§ć­Š201

    Guided Testing of Concurrent Programs Using Value Schedules

    Get PDF
    Testing concurrent programs remains a difficult task due to the non-deterministic nature of concurrent execution. Many approaches have been proposed to tackle the complexity of uncovering potential concurrency bugs. Static analysis tackles the problem by analyzing a concurrent program looking for situations/patterns that might lead to possible errors during execution. In general, static analysis cannot precisely locate all possible concurrent errors. Dynamic testing examines and controls a program during its execution also looking for situations/patterns that might lead to possible errors during execution. In general, dynamic testing needs to examine all possible execution paths to detect all errors, which is intractable. Motivated by these observation, a new testing technique is developed that uses a collaboration between static analysis and dynamic testing to find the first potential error but using less time and space. In the new collaboration scheme, static analysis and dynamic testing interact iteratively throughout the testing process. Static analysis provides coarse-grained flow-information to guide the dynamic testing through the relevant search space, while dynamic testing collects concrete runtime-information during the guided exploration. The concrete runtime-information provides feedback to the static analysis to refine its analysis, which is then feed forward to provide more precise guidance of the dynamic testing. The new collaborative technique is able to uncover the first concurrency-related bug in a program faster using less storage than the state-of-the-art dynamic testing-tool Java PathFinder. The implementation of the collaborative technique consists of a static-analysis module based on Soot and a dynamic-analysis module based on Java PathFinder

    Data Races vs. Data Race Bugs: Telling the Difference with Portend

    Get PDF
    Even though most data races are harmless, the harmful ones are at the heart of some of the worst concurrency bugs. Alas, spotting just the harmful data races in programs is like finding a needle in a haystack: 76%-90% of the true data races reported by state-of-the- art race detectors turn out to be harmless [45]. We present Portend, a tool that not only detects races but also automatically classifies them based on their potential con- sequences: Could they lead to crashes or hangs? Could their effects be visible outside the program? Are they harmless? Our proposed technique achieves high accuracy by efficiently analyzing multiple paths and multiple thread schedules in combination, and by performing symbolic comparison between program outputs. We ran Portend on 7 real-world applications: it detected 93 true data races and correctly classified 92 of them, with no human effort. 6 of them are harmful races. Portend’s classification accuracy is up to 88% higher than that of existing tools, and it produces easy- to-understand evidence of the consequences of harmful races, thus both proving their harmfulness and making debugging easier. We envision Portend being used for testing and debugging, as well as for automatically triaging bug reports

    Accurately Classifying Data Races with Portend

    Get PDF
    Even though most data races are harmless, the harmful ones are at the heart of some of the worst concurrency bugs. Eliminating all data races from programs is impractical (e.g., system performance could suffer severely), yet spotting just the harmful ones is like finding a needle in a haystack: state-of-the-art data race detectors and classifiers suffer from high false positive rates of 37%–84%. We present Portend, a technique and system for automatically triaging suspect data races based on their potential consequences: Could they lead to crashes or hangs? Alter system state? Could their effects be externalized? Or are they harmless? Our proposed technique achieves very high accuracy by efficiently analyzing multiple paths and multiple thread schedules in combination, and by performing symbolic comparison between program states. We ran Portend on several dozen data races from real-world applications, and it correctly classified all of them, with no human effort. It also produced easy-to-understand evidence of the consequences of harmful races, thus proving their harmfulness and making debugging easier. We envision using Portend for testing and debugging, as well as for automatically triaging bug reports

    AUTOMATIC CRITICAL SECTION DISCOVERY USING MEMORY USAGE PATTERNS.

    Get PDF
    Parallel programming introduces new types of bugs that are notoriously difficult to find. As a result researchers have put a significant amount of effort into creating tools and techniques to discover parallel bugs. One of these bugs is the violation of the assumption of atomicity-- the assumption that a region of code, called a critical section, executes without interruption from an outside operation. In this thesis, we introduce a new heuristic to infer critical sections using the temporal and spatial locality of critical sections and provide empirical results showing that the heuristic can infer critical sections in shared memory programs. Real critical sections in benchmark programs are completely covered by inferred critical sections up to 75% to 80% of the time. A programmer can use the reported critical sections to inform his addition of locks into the program

    Automatic Detection, Validation and Repair of Race Conditions in Interrupt-Driven Embedded Software

    Full text link
    Interrupt-driven programs are widely deployed in safety-critical embedded systems to perform hardware and resource dependent data operation tasks. The frequent use of interrupts in these systems can cause race conditions to occur due to interactions between application tasks and interrupt handlers (or two interrupt handlers). Numerous program analysis and testing techniques have been proposed to detect races in multithreaded programs. Little work, however, has addressed race condition problems related to hardware interrupts. In this paper, we present SDRacer, an automated framework that can detect, validate and repair race conditions in interrupt-driven embedded software. It uses a combination of static analysis and symbolic execution to generate input data for exercising the potential races. It then employs virtual platforms to dynamically validate these races by forcing the interrupts to occur at the potential racing points. Finally, it provides repair candidates to eliminate the detected races. We evaluate SDRacer on nine real-world embedded programs written in C language. The results show that SDRacer can precisely detect and successfully fix race conditions.Comment: This is a draft version of the published paper. Ke Wang provides suggestions for improving the paper and README of the GitHub rep
    • 

    corecore