8 research outputs found

    Mining and untangling change genealogies

    Get PDF
    Developers change source code to add new functionality, fix bugs, or refactor their code. Many of these changes have immediate impact on quality or stability. However, some impact of changes may become evident only in the long term. This thesis makes use of change genealogy dependency graphs modeling dependencies between code changes capturing how earlier changes enable and cause later ones. Using change genealogies, it is possible to: (a) applyformalmethodslikemodelcheckingonversionarchivestorevealtemporal process patterns. Such patterns encode key features of the software process and can be validated automatically: In an evaluation of four open source histories, our prototype would recommend pending activities with a precision of 60—72%. (b) classify the purpose of code changes. Analyzing the change dependencies on change genealogies shows that change genealogy network metrics can be used to automatically separate bug fixing from feature implementing code changes. (c) build competitive defect prediction models. Defect prediction models based on change genealogy network metrics show competitive prediction accuracy when compared to state-of-the-art defect prediction models. As many other approaches mining version archives, change genealogies and their applications rely on two basic assumptions: code changes are considered to be atomic and bug reports are considered to refer to corrective maintenance tasks. In a manual examination of more than 7,000 issue reports and code changes from bug databases and version control systems of open- source projects, we found 34% of all issue reports to be misclassified and that up to 15% of all applied issue fixes consist of multiple combined code changes serving multiple developer maintenance tasks. This introduces bias in bug prediction models confusing bugs and features. To partially solve these issues we present an approach to untangle such combined changes with a mean success rate of 58—90% after the fact.Softwareentwickler ändern Source-Code um neue Funktionalität hinzuzufügen, Bugs zu beheben oder um ihren Code zu restrukturieren. Viele dieser Änderungen haben einen direkten Einfluss auf Qualität und Stabilität des Softwareprodukts. Jedoch kommen einige dieser Einflüsse erst zu einem späteren Zeitpunkt zur Geltung. Diese Arbeit verwendet Genealogien zwischen Code-Änderungen um zu erfassen, wie frühere Änderungen spätere Änderungen erfordern oder ermöglichen. Die Verwendung von Änderungs-Genealogien ermöglicht: (a) die Anwendung formaler Methoden wie Model-Checking auf Versionsarchive um temporäre Prozessmuster zu erkennen. Solche Prozessmuster verdeutlichen Hauptmerkmale eines Softwareentwicklungsprozesses: In einer Evaluation auf vier Open-Source Projekten war unser Prototyp im Stande noch ausstehende Änderungen mit einer Präzision von 60–72% vorherzusagen. (b) die Absicht einer Code-Änderung zu bestimmen. Analysen von Änderungsabhängigkeiten zeigen, dass Netzwerkmetriken auf Änderungsgenealogien geeignet sind um fehlerbehebende Änderungen von Änderungen die eine Funktionalität hinzufügen zu trennen. (c) konkurrenzfähige Fehlervorhersagen zu erstellen. Fehlervorhersagen basierend auf Genealogie-Metriken können sich mit anerkannten Fehlervorhersagemodellen messen. Änderungs-Genealogien und deren Anwendungen basieren, wie andere Data-Mining Ansätze auch, auf zwei fundamentalen Annahmen: Code-Änderungen beabsichtigen die Lösung nur eines Problems und Bug-Reports weisen auf Fehler korrigierende Tätigkeiten hin. Eine manuelle Inspektion von mehr als 7.000 Issue-Reports und Code-Änderungen hat ergeben, dass 34% aller Issue-Reports falsch klassifiziert sind und dass bis zu 15% aller fehlerbehebender Änderungen mehr als nur einem Entwicklungs-Task dienen. Dies wirkt sich negativ auf Vorhersagemodelle aus, die nicht mehr klar zwischen Bug-Fixes und anderen Änderungen unterscheiden können. Als Lösungsansatz stellen wir einen Algorithmus vor, der solche nicht eindeutigen Änderungen mit einer Erfolgsrate von 58–90% entwirrt

    WarningsGuru - Analysing historical commits, augmenting software bug prediction models with warnings and a user study

    Get PDF
    The detection of bugs in software is divided into two research fields. Static analysis report warnings at the line level and are often false positives. Statistical models use historical change measures to predict bugs in commits at a higher level. We developed a tool which combines both of these approaches. Our tool analyses each commit of a project and identifies which commit introduced a warning. It processed over 45k commits, more then previous research. We propose an augmented bug model which includes static analysis measures which found that a twofold increase in the number of new warnings increases the odds of introducing a bug 1.5 times. Overall, our model accounts for 22% of the deviance which is an improvement over the 19.5% baseline. We demonstrate that we can use simple measure to predict new security warnings with a deviance explained of 30% and that recent development experience and more co-developers reduces the number of security warnings by 8%. We perform a user study of developers who introduced new warnings in 37 projects. We found that 53% and 21% of warnings in Findbugs and Jlint respectively are useful. We analysed the time delta between the introduction and response of the developer to the notification of the warning. We hypothise that remembering the context of the change as an impact on the perceived usefulness given useful warnings had a median of 11.5 versus 23 days for non useful warning

    Issues with SZZ: An empirical assessment of the state of practice of defect prediction data collection

    Get PDF
    Defect prediction research has a strong reliance on published data sets that are shared between researchers. The SZZ algorithm is the de facto standard for collecting defect labels for this kind of data and is used by most public data sets. Thus, problems with the SZZ algorithm may have a strong indirect impact on almost the complete state of the art of defect prediction. Recent research uncovered potential problems in different parts of the SZZ algorithm. Within this article, we provide an extensive empirical analysis of the defect labels created with the SZZ algorithm. We used a combination of manual validation and adopted or improved heuristics for the collection of defect data to establish ground truth data for bug fixing commits, improved the heuristic for the identification of inducing changes for defects, as well as the assignment of bugs to releases. We conducted an empirical study on 398 releases of 38 Apache projects and found that only half of the bug fixing commits determined by SZZ are actually bug fixing. Moreover, if a six month time frame is used in combination with SZZ to determine which bugs affect a release, one file is incorrectly labeled as defective for every file that is correctly labeled as defective. In addition, two defective files are missed. We also explored the impact of the relatively small set of features that are available in most defect prediction data sets, as there are multiple publications that indicate that, e.g., churn related features are important for defect prediction. We found that the difference of using more features is negligible.Comment: Submitted and under review. First three authors are equally contributin

    プログラムの解析、テスト、修復のための表現学習

    Get PDF
    学位の種別: 課程博士審査委員会委員 : (主査)東京大学特任准教授 松尾 豊, 東京大学教授 和泉 潔, 東京大学准教授 阿部 力也, 東京大学准教授 森 純一郎, 国立情報学研究所教授 蓮尾 一郎University of Tokyo(東京大学
    corecore