
    Efficiency Improvements in the Quality Assurance Process for Data Races

    As concurrency in software has gained importance in recent years, and continues to do so, new types of defects have increasingly appeared. Data races are among the most prominent and critical of these. Although research has increased the effectiveness of dynamic quality assurance for data races, the efficiency of the quality assurance process is still a factor preventing widespread practical application. First, the dynamic quality assurance techniques used to detect data races are inefficient: conducting dynamic quality assurance takes too much effort. Second, the techniques used to analyze reported data races are inefficient: analyzing reported data races and identifying the underlying issues in the source code takes too much effort. The goal of this thesis is to enable efficiency improvements in the quality assurance process for data races by (1) analyzing the representation of the dynamic behavior of a system under test, and using the results to focus the instrumentation of this system, which lowers the runtime overhead during test execution compared to full instrumentation; and (2) analyzing the characteristics of reported data races and preprocessing them. The results of the preprocessing are then provided to developers and quality assurance personnel, enabling an analysis and debugging process that is more efficient than the traditional analysis of data race reports. With the exception of dynamic data race detection itself, which the solution complements, all steps in the process of dynamic quality assurance for data races are discussed in this thesis. The solution for analyzing UML Activities for nodes that may execute in parallel to other nodes or to themselves is built on a formal foundation in graph theory. A major problem solved in this thesis was the handling of cycles within UML Activities.
This thesis provides a dynamic limit for the number of cycle traversals, based on the elements of each UML Activity to be analyzed and their semantics. Formal proofs are provided for the creation of directed acyclic graphs and for their analysis, which identifies elements that may be executed in parallel to other elements. Based on an examination of the characteristics of data races and data race reports, the results of dynamic data race detection are preprocessed, and the outcome of this preprocessing is presented to users for further analysis. This thesis further provides an exemplary application of the solution idea and of the analysis of UML Activities, as well as an exemplary examination of the efficiency improvement of dynamic data race detection, which showed a 44% reduction in runtime overhead when using focused instrumentation compared to full instrumentation. In addition, a controlled experiment was set up and conducted to examine the effects of preprocessing reported data races on the efficiency of analyzing data race reports. The results show that the solution presented in this thesis enables efficiency improvements in the analysis of data race reports of between 190% and 660% compared to traditional approaches. Finally, opportunities for future work are shown, which may enable broader use of the results of this thesis and further improvements in the efficiency of quality assurance for data races.
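
    The idea of finding elements that may be executed in parallel in a directed acyclic graph can be illustrated with a small sketch. This is not the thesis's algorithm: it assumes every branching node behaves like a fork, so it over-approximates for decision nodes, and it ignores cycle unrolling entirely. The graph, the node names, and the function names are hypothetical. Two nodes are reported as possibly parallel when neither can reach the other.

```python
from itertools import combinations


def reachable(graph, start):
    """Return every node reachable from `start` by following edges."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for successor in graph.get(node, ()):
            if successor not in seen:
                seen.add(successor)
                stack.append(successor)
    return seen


def may_run_in_parallel(graph):
    """Return unordered node pairs with no path between them in either direction.

    Assumption: every branching is a fork, so unordered nodes are treated
    as potentially concurrent. Real UML semantics are richer than this.
    """
    nodes = set(graph) | {s for successors in graph.values() for s in successors}
    reach = {node: reachable(graph, node) for node in nodes}
    return {
        frozenset((a, b))
        for a, b in combinations(sorted(nodes), 2)
        if b not in reach[a] and a not in reach[b]
    }


# Hypothetical activity: a fork starts two branches that meet again at a join.
activity = {
    "fork": ["read", "write"],
    "read": ["join"],
    "write": ["join"],
    "join": ["log"],
}
pairs = may_run_in_parallel(activity)
print(sorted(sorted(pair) for pair in pairs))
```

    In a focused-instrumentation setting, only the code behind such pairs would be candidates for instrumentation; how the thesis maps model elements to code is not shown here.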

    High-Performance and Time-Predictable Embedded Computing

    Computing systems are now so ubiquitous that we live in a cyber-physical world dominated by them, from pacemakers to cars and airplanes. These systems demand more computational performance to process large amounts of data from multiple data sources with guaranteed processing times. Acting outside the required timing bounds may cause the system to fail, which is critical in domains such as planes, cars, business monitoring, and e-trading. High-Performance and Time-Predictable Embedded Computing presents recent advances in software architecture and tools to support such complex systems, enabling the design of embedded computing devices that deliver high performance while guaranteeing the timing bounds the application requires. Technical topics discussed in the book include: parallel embedded platforms; programming models; mapping and scheduling of parallel computations; timing and schedulability analysis; and runtimes and operating systems. The work reflected in this book was done in the scope of the European project P-SOCRATES, funded under the FP7 framework program of the European Commission. The book is ideal for personnel in the computer, communication, and embedded industries, as well as academic staff and master's and research students in computer science, embedded systems, cyber-physical systems, and the internet of things.

    Pingo: A Framework for the Management of Storage of Intermediate Outputs of Computational Workflows

    Scientific workflows allow scientists to easily model and express all of their data processing steps, typically as a directed acyclic graph (DAG). These workflows consist of a collection of tasks that usually take a long time to compute and that produce a considerable number of intermediate datasets. Because of the nature of scientific exploration, a scientific workflow can be modified and re-run multiple times, and new workflows may be created that make use of past intermediate datasets. Storing intermediate datasets therefore has the potential to save computation time. Since storage is limited, one main problem is determining which intermediate datasets should be saved at creation time in order to minimize the computational time of workflows run in the future. This thesis proposes the design and implementation of Pingo, a system capable of managing the computations of scientific workflows as well as the storage, provenance, and deletion of intermediate datasets. Pingo uses the history of workflows submitted to the system to predict the datasets most likely to be needed in the future, and bases dataset deletion decisions on optimizing the computational time of future workflows. (Masters Thesis, Computer Science)
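
    The storage decision described above can be sketched as a small selection problem. The following is a simple greedy heuristic, not Pingo's actual policy, which the abstract does not specify. The `Dataset` fields, the reuse probabilities, and the function names are all hypothetical; in a real system the probabilities would come from the workflow history.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    name: str
    size_gb: float            # storage cost of keeping the dataset (must be > 0)
    recompute_hours: float    # time needed to regenerate it
    reuse_probability: float  # hypothetical estimate derived from past workflows


def expected_saving(dataset):
    """Expected compute time saved by keeping the dataset."""
    return dataset.reuse_probability * dataset.recompute_hours


def choose_datasets_to_keep(datasets, capacity_gb):
    """Greedily keep datasets with the best expected saving per gigabyte.

    This is a heuristic for a knapsack-style problem and is not
    guaranteed to be optimal.
    """
    ranked = sorted(
        datasets,
        key=lambda d: expected_saving(d) / d.size_gb,
        reverse=True,
    )
    kept, used_gb = [], 0.0
    for dataset in ranked:
        if used_gb + dataset.size_gb <= capacity_gb:
            kept.append(dataset)
            used_gb += dataset.size_gb
    return kept


# Hypothetical intermediate datasets competing for 60 GB of storage.
candidates = [
    Dataset("aligned", size_gb=40, recompute_hours=10, reuse_probability=0.8),
    Dataset("filtered", size_gb=10, recompute_hours=2, reuse_probability=0.9),
    Dataset("raw_index", size_gb=50, recompute_hours=3, reuse_probability=0.1),
]
kept_names = [d.name for d in choose_datasets_to_keep(candidates, capacity_gb=60)]
print(kept_names)
```

    Ranking by saving per gigabyte favors small, expensive-to-recompute datasets; an exact knapsack solver could replace the greedy loop if optimality matters more than speed.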

    Adaptive On-the-Fly Changes in Distributed Processing Pipelines

    Distributed data processing systems have become the standard means for big data analytics. These systems are based on processing pipelines in which operations on data are performed in a chain of consecutive steps. Normally, the operations performed by these pipelines are set at design time, and any change to their functionality requires the application to be restarted. This is not always acceptable, for example when downtime cannot be afforded or when a long-running calculation would lose significant progress. Introducing variation points into distributed processing pipelines allows individual analysis steps to be updated on the fly. In this paper, we extend this basic variation point functionality to provide fully automated reconfiguration of the processing steps within a running pipeline through an automated planner. We have enabled pipeline modeling through constraints. Based on these constraints, we not only ensure that configurations are type-compatible but also verify that the expected pipeline functionality is achieved. Furthermore, automating the reconfiguration process simplifies its use, in turn allowing users with less development experience to make changes. The system can automatically generate and validate pipeline configurations that achieve a specified goal, selecting from the operation definitions available at planning time. It then automatically integrates these configurations into the running pipeline. We verify the system by testing a proof-of-concept implementation. The proof of concept also shows promising results when reconfiguration is performed frequently.
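
    To make the planning idea concrete, here is a minimal sketch of generating and validating a pipeline configuration from operation definitions. It is not the paper's planner: it assumes each operation has exactly one input type and one output type, uses a plain breadth-first search, and checks only type compatibility, not pipeline functionality. The operation names, type names, and function names are hypothetical.

```python
from collections import deque

# Hypothetical operation definitions: name -> (input type, output type).
OPERATIONS = {
    "parse_json": ("bytes", "record"),
    "decode_text": ("bytes", "text"),
    "extract_field": ("record", "text"),
    "tokenize": ("text", "tokens"),
    "count_words": ("tokens", "counts"),
}


def plan_pipeline(operations, source_type, goal_type):
    """Return a shortest chain of operation names from source to goal type.

    Returns None when no chain of the available operations reaches the goal.
    """
    queue = deque([(source_type, [])])
    visited = {source_type}
    while queue:
        current_type, steps = queue.popleft()
        if current_type == goal_type:
            return steps
        for name, (input_type, output_type) in operations.items():
            if input_type == current_type and output_type not in visited:
                visited.add(output_type)
                queue.append((output_type, steps + [name]))
    return None


def is_type_compatible(operations, steps, source_type):
    """Check that each step accepts the type produced by the previous step."""
    current_type = source_type
    for name in steps:
        input_type, output_type = operations[name]
        if input_type != current_type:
            return False
        current_type = output_type
    return True


plan = plan_pipeline(OPERATIONS, "bytes", "counts")
print(plan)
print(is_type_compatible(OPERATIONS, plan, "bytes"))
```

    Validating a plan separately from generating it mirrors the abstract's distinction between producing configurations and checking them before they are integrated into a running pipeline.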