
    GIMO : A multi-objective anytime rule mining system to ease iterative feedback from domain experts

    Data extracted from software repositories is used intensively in Software Engineering research, for example, to predict defects in source code. In our research in this area, with data from open source projects as well as an industrial partner, we noticed several shortcomings of conventional data mining approaches for classification problems: (1) Domain experts’ acceptance is of critical importance, and domain experts can provide valuable input, but it is hard to use this feedback. (2) Evaluating the quality of the model is not simply a matter of calculating AUC or accuracy. Instead, there are multiple objectives of varying importance with hard-to-quantify trade-offs. Furthermore, the performance of the model cannot be evaluated on a per-instance level in our case, because it shares aspects with the set cover problem. To overcome these problems, we take a holistic approach and develop a rule mining system that simplifies iterative feedback from domain experts and can incorporate the domain-specific evaluation needs. A central part of the system is a novel multi-objective anytime rule mining algorithm. The algorithm is based on the GRASP-PR meta-heuristic but extends it with ideas from several other approaches. We successfully applied the system in the industrial context. In the current article, we focus on the description of the algorithm and the concepts of the system. We make an implementation of the system available. © 2020 The Author
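The abstract's combination of ideas can be illustrated with a minimal sketch. This is not the actual GIMO algorithm (which builds on GRASP-PR); it only shows the anytime, multi-objective shape described above: a loop samples candidate conjunctive rules, scores each on two hypothetical objectives (precision and coverage), and maintains a Pareto archive of non-dominated rules that a domain expert could inspect whenever the loop is interrupted. All names and the sampling strategy are illustrative assumptions.

```python
import random

def dominates(a, b):
    """True if objective vector a Pareto-dominates b (maximization)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def evaluate(rule, data):
    """Precision and coverage of a conjunctive rule over (row, label) pairs."""
    covered = [label for row, label in data if all(row.get(k) == v for k, v in rule)]
    if not covered:
        return (0.0, 0.0)
    return (sum(covered) / len(covered), len(covered) / len(data))

def anytime_rule_mining(data, conditions, iterations=200, seed=0):
    """Anytime loop: the archive is a valid (partial) result at every iteration."""
    rng = random.Random(seed)
    archive = {}  # frozenset of conditions -> objectives, mutually non-dominated
    for _ in range(iterations):
        size = rng.randint(1, min(3, len(conditions)))
        rule = frozenset(rng.sample(conditions, size))  # stand-in for GRASP construction
        if rule in archive:
            continue
        obj = evaluate(rule, data)
        if any(dominates(o, obj) for o in archive.values()):
            continue  # a known rule is strictly better on all objectives
        archive = {r: o for r, o in archive.items() if not dominates(obj, o)}
        archive[rule] = obj
    return archive
```

The dictionary-based archive keeps only non-dominated rules, so stopping early yields a smaller but still consistent rule set, which is the essence of the "anytime" property.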

    Cognitive-support code review tools : improved efficiency of change-based code review by guiding and assisting reviewers

    Code reviews, i.e., systematic manual checks of program source code by other developers, have been an integral part of the quality assurance canon in software engineering since their formalization by Michael Fagan in the 1970s. Computer-aided tools supporting the review process have been known for decades and are now widely used in software development practice. Despite this long history and widespread use, current tools hardly go beyond simple automation of routine tasks. The core objective of this thesis is to systematically develop options for improved tool support for code reviews and to evaluate them in the interplay of research and practice. The starting point is a comprehensive analysis of the state of research and practice. Interview and survey data collected in this thesis show that review processes in practice are now largely change-based, i.e., based on checking the changes resulting from the iterative-incremental evolution of software. This is true not only for open source projects and large technology companies, as shown in previous research, but across the industry. Despite the common change-based core process, there are various differences in the details of the review processes. The thesis identifies possible factors influencing these differences. Important factors appear to be the process variants supported and promoted by the review tool in use. In contrast, the tool in use has little influence on the fundamental decision to use regular code reviews; instead, the interview and survey data suggest that this decision depends more on cultural factors. Overall, the analysis of the state of research and practice shows that there is potential for developing better code review tools, and this potential is associated with the opportunity to increase efficiency in software development. 
The present thesis argues that the most promising approach for better review support is reducing the reviewer's cognitive load when reviewing large code changes. Results of a controlled experiment support this reasoning. The thesis explores various possibilities for cognitive support, two of these in detail: guiding the reviewer by identifying and presenting a good order in which to read the code changes under review, and assisting the reviewer through automatic detection of change parts that are irrelevant for review. In both cases, empirical data is used to both generate and test hypotheses. To demonstrate the practical suitability of the techniques, they are also used in a partner company in regular development practice. For this evaluation of the cognitive support techniques in practice, a review tool was needed that is suitable both for use in the partner company and as a platform for review research. As no such tool was available, the code review tool "CoRT" was developed. Here, too, a combination of analyzing the state of research, supporting design decisions through scientific studies, and evaluating the tool in practical use was employed. Overall, the results of this thesis can be roughly divided into three blocks: Researchers and practitioners working on improving review tools receive an empirically and theoretically sound catalog of requirements for cognitive-support review tools. It is available explicitly in the form of essential requirements and possible forms of realization, and additionally implicitly in the form of the tool "CoRT". The second block consists of contributions to the fundamentals of review research, ranging from the comprehensive analysis of review processes in practice to the analysis of the impact of cognitive abilities (specifically, working memory capacity) on review performance. 
As the third block, innovative methodological approaches have been developed within this thesis, e.g., the use of process simulation to develop heuristics for development teams and new approaches to repository and data mining.
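One of the two techniques detailed above, guiding the reviewer through a good reading order of change parts, can be sketched in a few lines. This is a hypothetical heuristic for illustration only, not CoRT's actual ordering algorithm: it keeps change parts of the same file together (reducing context switches) and reads each file top-down. The `ChangePart` type and the sort key are assumptions.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass(frozen=True)
class ChangePart:
    """One reviewable hunk of a code change (illustrative model)."""
    file: str
    start_line: int

def reading_order(parts):
    """Order change parts to reduce the reviewer's context switches:
    files with the most changed parts first, each file read top-down."""
    counts = Counter(p.file for p in parts)
    return sorted(parts, key=lambda p: (-counts[p.file], p.file, p.start_line))
```

A real tool would additionally consider relations between parts, e.g., putting a changed declaration before its changed usages; the point here is only that an explicit ordering is something a review tool can compute and present.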

    Classification of requirements and information to support quality assurance processes

    Requirements documents are used in requirements management to document the properties and behavior of systems. In the automotive industry, these documents describe the components to be manufactured by suppliers and provide a legal basis for the communication between supplier and manufacturer. They must therefore comply with various quality standards and quality guidelines. In manual reviews, requirements documents are checked against these guidelines. One guideline states that requirements documents must clearly separate legally binding requirements from so-called additional information (figures, explanations, examples, references, etc.). To this end, each object is annotated with an object type according to its content. Checking the correctness of the object type is a time-consuming and error-prone process, since requirements documents typically comprise several thousand objects. Using the review of the object type as an example, this thesis investigates whether and in what way the review process can be supported by machine learning. First, a classifier is trained that is able to distinguish between requirements and additional information. A tool based on this classifier supports users in checking the object type through hints and warnings. Empirical studies examine whether users can perform the review of requirements documents better with the tool. The results show that users not only find more misclassified objects but also save on average 60% of the time spent on the review. 
By transferring the approach to a further classification problem, it is also shown that such tool support is not limited to the object-type classification use case but is potentially transferable to many other guidelines that need to be checked.
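The core classification task described in the abstract can be illustrated with a deliberately simple stand-in. The thesis trains a machine-learning classifier; the sketch below instead uses a keyword heuristic (requirements in German and English specifications typically contain modal verbs such as "muss" or "shall") purely to make the requirement-vs-additional-information distinction concrete. The marker list and function names are assumptions, not part of the thesis.

```python
# Hypothetical keyword heuristic, not the trained classifier from the
# thesis: an object counts as a legally binding requirement if it
# contains a modal verb typical for requirements text.
REQUIREMENT_MARKERS = {"muss", "müssen", "shall", "must"}

def classify(text):
    """Label one document object as a requirement or additional information."""
    tokens = {t.strip(".,;:!?").lower() for t in text.split()}
    return "requirement" if tokens & REQUIREMENT_MARKERS else "additional information"
```

A review-support tool, as described above, would compare such predictions against the object type annotated by the author and raise a warning on mismatches, leaving the final decision to the reviewer.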