
    Efficient, Near Complete and Often Sound Hybrid Dynamic Data Race Prediction (extended version)

    Dynamic data race prediction aims to identify races based on a single program run represented by a trace. The challenge is to remain efficient while being as sound and as complete as possible. Efficient means linear run-time, as otherwise the method is unlikely to scale to real-world programs. We introduce an efficient, near complete and often sound dynamic data race prediction method that combines the lockset method with several improvements made in the area of happens-before methods. By near complete we mean that the method is complete in theory, but for efficiency reasons the implementation applies some optimizations that may result in incompleteness. The method can be shown to be sound for two threads but is unsound in general. We provide extensive experimental data that shows that our method works well in practice.
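    As a concrete illustration of such a combination, the following is a minimal hedged sketch (not the paper's actual algorithm or its optimizations; all names here are illustrative assumptions): two accesses to the same variable are reported as a potential race if at least one is a write, the locksets held at the two accesses are disjoint, and neither access happens before the other according to vector clocks.

from dataclasses import dataclass, field

@dataclass
class Access:
    thread: int
    is_write: bool
    lockset: frozenset                        # locks held at the time of the access
    vc: dict = field(default_factory=dict)    # vector clock snapshot at the access

def happens_before(a, b):
    # a happens before b if a's clock is component-wise <= b's clock.
    return all(val <= b.vc.get(t, 0) for t, val in a.vc.items())

def potential_race(a, b):
    if a.thread == b.thread:
        return False
    if not (a.is_write or b.is_write):
        return False                          # two reads never race
    if a.lockset & b.lockset:
        return False                          # a common lock serializes the accesses
    # Unordered in both directions -> concurrent, report as a candidate race.
    return not happens_before(a, b) and not happens_before(b, a)

# Example: thread 1 writes x while holding lock L, thread 2 reads x with no lock.
w = Access(thread=1, is_write=True,  lockset=frozenset({"L"}), vc={1: 3})
r = Access(thread=2, is_write=False, lockset=frozenset(),      vc={2: 5})
print(potential_race(w, r))                   # True: disjoint locksets, unordered accesses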

    Hybrid Data Race Detection for Multicore Software

    Multithreaded programs are prone to concurrency errors such as deadlocks, race conditions and atomicity violations. These errors are notoriously difficult to detect due to the non-deterministic nature of concurrent software running on multicore hardware. Data races result from the concurrent access of shared data by multiple threads and can result in unexpected program behaviors. The main dynamic data race detection techniques in the literature are the happens-before and lockset algorithms, which suffer from high execution time and memory overhead, miss many data races, or produce a high number of false alarms. Our goal is to improve the performance of dynamic data race detection while at the same time improving its accuracy by generating fewer false alarms. We develop a hybrid data race detection algorithm that combines the happens-before and lockset algorithms in a tool. Rather than focusing on individual memory accesses by each thread, we focus on sequences of memory accesses by each thread, called segments. This allows us to improve the performance of data race detection. We implement several optimizations on our hybrid data race detector and compare our technique with traditional happens-before and lockset detectors. The experiments are performed with C/C++ multithreaded benchmarks using the Pthreads library from the PARSEC suite and with large applications such as the Apache web server. Our experiments showed that our hybrid detector is 15% faster than the happens-before detector and produces 50% fewer potential data races than the lockset detector. Ultimately, a hybrid data race detector can improve the performance and accuracy of data race detection, enhancing its usability in practice.
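    To illustrate the segment idea, the following hypothetical sketch (not the tool described above) groups each thread's consecutive accesses between synchronization points into a segment and compares whole segments for conflicting addresses, so pairwise checks operate on segments rather than on individual accesses.

from collections import defaultdict

def build_segments(trace):
    # trace: list of (thread, op, addr) where op is 'rd', 'wr', or 'sync'.
    segments = defaultdict(list)                                     # thread -> list of segments
    current = defaultdict(lambda: {"reads": set(), "writes": set()})
    for thread, op, addr in trace:
        if op == "sync":
            # A synchronization point closes the thread's current segment.
            segments[thread].append(current.pop(thread, {"reads": set(), "writes": set()}))
        elif op == "wr":
            current[thread]["writes"].add(addr)
        else:
            current[thread]["reads"].add(addr)
    for thread, seg in current.items():                              # flush open segments
        segments[thread].append(seg)
    return segments

def conflicting(seg_a, seg_b):
    # Addresses on which the two segments conflict (write/write or read/write).
    return ((seg_a["writes"] & (seg_b["writes"] | seg_b["reads"]))
            | (seg_b["writes"] & seg_a["reads"]))

trace = [(1, "wr", "x"), (1, "rd", "y"), (1, "sync", None),
         (2, "rd", "x"), (2, "wr", "z")]
segs = build_segments(trace)
print(conflicting(segs[1][0], segs[2][0]))                           # {'x'}: candidate for a closer check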

    Dynamic Analysis of Embedded Software

    Most embedded applications are constructed with multiple threads to handle concurrent events. For optimization and debugging of such programs, dynamic program analysis is widely used to collect execution information while the program is running. Unfortunately, the non-deterministic behavior of multithreaded embedded software makes dynamic analysis difficult. In addition, instrumentation overhead for gathering execution information may change the execution of a program and lead to distorted analysis results, i.e., the probe effect. This thesis presents a framework that tackles the non-determinism and probe effect incurred in dynamic analysis of embedded software. The thesis largely consists of three parts. First, we discuss a deterministic replay framework that provides reproducible execution. Once a program execution is recorded, software instrumentation can be safely applied during replay without probe effect. Second, a discussion of the probe effect is presented and a simulation-based analysis is proposed to detect execution changes of a program caused by instrumentation overhead. The simulation-based analysis examines whether the recording instrumentation changes the original program execution. Lastly, the thesis discusses data race detection algorithms that help to remove data races for the correctness of the replay and the simulation-based analysis. The focus is to make the detection efficient for C/C++ programs and to increase the scalability of the detection on multi-core machines.
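    As a hedged illustration of one common replay mechanism, synchronization-order record/replay (an assumption made here for exposition, not necessarily the framework's actual design), the sketch below records the order of lock acquisitions during a recording run and enforces the same order during replay, so instrumentation added at replay time cannot change the interleaving of synchronization operations.

import threading

class ReplayLock:
    # Wraps a lock; logs acquisition order in "record" mode and enforces it in "replay" mode.
    def __init__(self, log, mode):
        self._lock = threading.Lock()
        self._log = log                    # shared list of thread names (the recorded order)
        self._mode = mode                  # "record" or "replay"
        self._cursor = 0
        self._cv = threading.Condition()

    def acquire(self):
        name = threading.current_thread().name
        if self._mode == "record":
            self._lock.acquire()
            self._log.append(name)         # remember who acquired, and in what order
        else:
            with self._cv:                 # replay: wait until it is this thread's turn
                while self._log[self._cursor] != name:
                    self._cv.wait()
            self._lock.acquire()

    def release(self):
        self._lock.release()
        if self._mode == "replay":
            with self._cv:
                self._cursor += 1          # advance to the next recorded acquisition
                self._cv.notify_all()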

    Dynamic Data-Race Detection Through the Fine-Grained Lens

    Data races are among the most common bugs in concurrency. The standard approach to data-race detection is via dynamic analyses, which work over executions of concurrent programs instead of the program source code. The rich literature on the topic has created various notions of dynamic data races, which are known to be detected efficiently when certain parameters (e.g., number of threads) are small. However, the fine-grained complexity of all these notions of races has remained elusive, making it impossible to characterize their trade-offs between precision and efficiency. In this work we establish several fine-grained separations between many popular notions of dynamic data races. The input is an execution trace σ with N events, T threads and L locks. Our main results are as follows. First, we show that happens-before (HB) races can be detected in O(N · min(T, L)) time, improving over the standard O(N · T) bound when L = o(T). Moreover, we show that even reporting an HB race that involves a read access is hard for 2-orthogonal vectors (2-OV). This is the first rigorous proof of the conjectured quadratic lower bound on detecting HB races. Second, we show that the recently introduced synchronization-preserving races are hard to detect for 3-OV and thus have a cubic lower bound, when T = Θ(N). This establishes a complexity separation from HB races, which are known to be strictly less expressive. Third, we show that lock-cover races are hard for 2-OV, and thus have a quadratic lower bound, even when T = 2 and L = ω(log N). The similar notion of lock-set races is known to be detectable in O(N · L) time, and thus we achieve a complexity separation between the two. Moreover, we show that lock-set races become hitting-set (HS)-hard when L = Θ(N), and thus also have a quadratic lower bound when the input is sufficiently complex. To our knowledge, this is the first work that characterizes the complexity of well-established dynamic race-detection techniques, allowing for a rigorous comparison between them.
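    For reference, the lock-set notion mentioned above admits a simple single-pass check in the spirit of Eraser. The sketch below is a simplified illustration only (it omits the thread-ownership state machine and is not one of the paper's fine-grained algorithms): for every variable it keeps the intersection of the locksets of all accesses seen so far and reports the variable once that intersection becomes empty and a write is involved.

def lockset_races(trace):
    # trace: list of (thread, op, var) with op in {'acq', 'rel', 'rd', 'wr'};
    # for 'acq'/'rel' the var field names the lock being acquired or released.
    held = {}                 # thread -> set of locks currently held
    candidate = {}            # var -> intersection of locksets of accesses so far
    written = set()           # vars with at least one write access
    races = set()
    for thread, op, var in trace:
        locks = held.setdefault(thread, set())
        if op == "acq":
            locks.add(var)
        elif op == "rel":
            locks.discard(var)
        else:
            if op == "wr":
                written.add(var)
            cand = candidate.get(var)
            candidate[var] = set(locks) if cand is None else cand & locks
            if not candidate[var] and var in written:
                races.add(var)            # no common lock protects var
    return races

trace = [(1, "acq", "m"), (1, "wr", "x"), (1, "rel", "m"),
         (2, "wr", "x")]                  # thread 2 writes x without holding m
print(lockset_races(trace))               # {'x'}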

    Efficiency Improvements in the Quality Assurance Process for Data Races

    As the usage of concurrency in software has gained importance in recent years, and is still rising, new types of defects have increasingly appeared in software. One of the most prominent and critical of these new defect types is the data race. Although research has increased the effectiveness of dynamic quality assurance for data races, the efficiency of the quality assurance process is still a factor preventing widespread practical application. First, the dynamic quality assurance techniques used for the detection of data races are inefficient: too much effort is needed to conduct dynamic quality assurance. Second, the techniques used for the analysis of reported data races are inefficient: too much effort is needed to analyze reported data races and identify issues in the source code. The goal of this thesis is to enable efficiency improvements in the process of quality assurance for data races by: (1) analyzing the representation of the dynamic behavior of a system under test; the results are used to focus instrumentation of this system, resulting in a lower runtime overhead during test execution compared to a full instrumentation of the system. (2) Analyzing characteristics of reported data races and preprocessing them; the results of the preprocessing are then provided to developers and quality assurance personnel, enabling an analysis and debugging process that is more efficient than traditional analysis of data race reports. Apart from dynamic data race detection itself, which the solution complements, all steps in the process of dynamic quality assurance for data races are discussed in this thesis. The solution for analyzing UML Activities for nodes possibly executing in parallel to other nodes or to themselves rests on a formal foundation from graph theory. A major problem solved in this thesis was the handling of cycles within UML Activities; this thesis provides a dynamic limit for the number of cycle traversals, based on the elements of each analyzed UML Activity and their semantics. Formal proofs are provided regarding the creation of directed acyclic graphs and regarding their analysis for the identification of elements that may be executed in parallel to other elements. Based on an examination of the characteristics of data races and data race reports, the results of dynamic data race detection are preprocessed, and the outcome of this preprocessing is presented to users for further analysis. The thesis further provides an exemplary application of the solution idea and of the results of analyzing UML Activities, as well as an exemplary examination of the efficiency improvement of dynamic data race detection, which showed a 44% reduction in runtime overhead when using focused instrumentation compared to full instrumentation. Finally, a controlled experiment was set up and conducted to examine the effects of the preprocessing of reported data races on the efficiency of analyzing data race reports. The results show that the solution presented in this thesis enables efficiency improvements in the analysis of data race reports of between 190% and 660% compared to traditional approaches.
Finally, opportunities for future work are shown, which may enable broader usage of the results of this thesis and further improvements in the efficiency of quality assurance for data races.
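    As a small illustrative sketch of the may-happen-in-parallel question on the derived directed acyclic graphs (an assumption made for exposition, not the thesis' actual analysis), two nodes of a DAG may execute in parallel exactly when neither is reachable from the other:

from itertools import combinations

def reachable(graph, start):
    # All nodes reachable from start via directed edges (iterative DFS).
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for succ in graph.get(node, []):
            if succ not in seen:
                seen.add(succ)
                stack.append(succ)
    return seen

def may_happen_in_parallel(graph):
    nodes = set(graph) | {n for succs in graph.values() for n in succs}
    reach = {n: reachable(graph, n) for n in nodes}
    return {(a, b) for a, b in combinations(sorted(nodes), 2)
            if b not in reach[a] and a not in reach[b]}

# Fork node "f" starts two branches "a" and "b" that join in node "j".
activity = {"f": ["a", "b"], "a": ["j"], "b": ["j"], "j": []}
print(may_happen_in_parallel(activity))   # {('a', 'b')}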

    Artificial Intelligence in Criminal Justice Settings: Where should be the limits of Artificial Intelligence in legal decision-making? Should an AI device make a decision about human justice?

    The application of Artificial Intelligence (AI) systems for high-stakes decision making is currently under debate. In the Criminal Justice System, it can provide great benefits as well as aggravate systematic biases and introduce unprecedented ones. Hence, should artificial devices be involved in the decision-making process? And if the answer is affirmative, where should the limits of that involvement be? To answer these questions, this dissertation examines two popular risk assessment tools currently in use in the United States, LS and COMPAS, to discuss the differences between a traditional instrument and an actuarial instrument that relies on computerized algorithms. Further analysis of the latter is done in relation to the Fairness, Accountability, Transparency and Ethics (FATE) perspective to be implemented in any technology involving AI. Although the future of AI is uncertain, the lack of knowledge about so many aspects of these innovative methods demands further research on how to make the best use of the opportunities they bring.