    HAM: Cross-cutting Concerns in Eclipse

    As programs evolve, newly added functionality sometimes no longer aligns with the original design, ending up scattered across the software system. Aspect mining tries to identify such cross-cutting concerns in a program to support maintenance, or as a first step towards an aspect-oriented program. Previous approaches to aspect mining applied static or dynamic program analysis techniques to a single version of a system. We leverage all versions from a system's CVS history to mine aspect candidates with our Eclipse plug-in HAM: when a single CVS commit adds calls to the same (small) set of methods in many unrelated locations, these method calls are likely to be cross-cutting. HAM employs formal concept analysis to identify aspect candidates. Analysing one commit at a time makes the approach scale to industrial-sized programs. In an evaluation we mined cross-cutting concerns from Eclipse 3.2M3 and found that up to 90% of the top-10 aspect candidates are truly cross-cutting concerns.
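
    The commit-level heuristic can be pictured with a small sketch. The following Python fragment is a simplified stand-in for HAM's analysis, not its actual implementation: it groups the locations touched by one commit by the exact set of callees added there, whereas HAM applies formal concept analysis; all function and parameter names are illustrative.

        from collections import defaultdict

        def aspect_candidates(commit_diffs, min_locations=5, max_callees=3):
            """Flag small sets of callees whose calls one commit adds at
            many distinct locations -- likely cross-cutting concerns."""
            candidates = []
            for commit_id, added_calls in commit_diffs:
                # added_calls maps a changed location (e.g. a method) to
                # the set of callees whose calls the commit added there.
                by_callee_set = defaultdict(list)
                for location, callees in added_calls.items():
                    if 0 < len(callees) <= max_callees:
                        by_callee_set[frozenset(callees)].append(location)
                for callees, locations in by_callee_set.items():
                    if len(locations) >= min_locations:
                        candidates.append((commit_id, callees, locations))
            return candidates

    A commit that adds calls to, say, a logging method in dozens of otherwise unrelated methods would surface near the top of this list.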

    Fine-Grained Source Code Tracking and Visualization in Commit History

    CodeTracker is the current state-of-the-art program element change history generator, with a reported precision and recall of 99.9% in method and variable tracking [1]. In this thesis, we extend the granularity of CodeTracker to support the tracking of control-flow blocks (e.g., for, while, if, try, catch), with a precision of 98.12% and a recall of 97.62%, providing researchers and developers with finer-grained information about the evolution of source code. We accompany this extension with a manually validated oracle, which includes the change histories of 1280 code blocks. These code blocks are contained within 200 methods from 20 open-source Java projects (10 methods from each project) that comprise the method change history oracle created by Grund et al. [2]. We also present a code change history visualization and navigation tool for CodeTracker, named CodeTracker Visualizer, that overlays the GitHub user interface with change history information, enabling users to track code elements directly from the commit page by simply selecting the desired code element. Finally, we compare CodeTracker’s block tracking precision and recall using two different tools that provide statement mappings, namely RefactoringMiner [3, 4], the current state-of-the-art refactoring detection tool, and GumTree [5], the current state-of-the-art Abstract Syntax Tree (AST) Diff tool. The enhanced version of CodeTracker along with the extended oracle are made publicly available to facilitate reproducibility and future research on code element tracking techniques [6].
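
    As a side note, precision and recall figures like those above can be computed per tracked element by comparing a generated change history against the oracle. The following Python helper is a hypothetical sketch of that comparison; the tuple encoding of changes is an assumption, not CodeTracker's actual output format.

        def precision_recall(reported, oracle):
            """Compare a reported change history against an oracle.

            Both arguments are sets of (commit_id, change_kind) tuples
            for one tracked code element."""
            true_positives = len(reported & oracle)
            precision = true_positives / len(reported) if reported else 1.0
            recall = true_positives / len(oracle) if oracle else 1.0
            return precision, recall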

    Changes and bugs: mining and predicting development activities

    Software development results in a huge amount of data: changes to source code are recorded in version archives, bugs are reported to issue tracking systems, and communications are archived in e-mails and newsgroups. In this thesis, we present techniques for mining version archives and bug databases to understand and support software development. First, we present techniques which mine version archives for fine-grained changes. We introduce the concept of co-addition of method calls, which we use to identify patterns that describe how methods should be called. We use dynamic analysis to validate these patterns and identify violations. The co-addition of method calls can also detect cross-cutting changes, which are an indicator for concerns that could have been realized as aspects in aspect-oriented programming. Second, we present techniques to build models that can successfully predict the most defect-prone parts of large-scale industrial software, in our experiments Windows Server 2003. This helps managers to allocate quality-assurance resources to those parts of a system that are expected to have the most defects. The proposed measures on dependency graphs outperformed traditional complexity metrics. In addition, we found empirical evidence for a domino effect: depending on defect-prone binaries increases the chances of having defects.
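
    To make the idea of co-addition concrete (the mining described in the thesis is more elaborate), here is a minimal Python sketch that counts how often calls to two methods are added together at the same location; all names are hypothetical.

        from collections import Counter
        from itertools import combinations

        def co_addition_patterns(added_call_sets, min_support=3):
            """Count pairs of methods whose calls are added together.

            added_call_sets: one set of callees per changed location in
            a commit, across all mined commits."""
            pair_counts = Counter()
            for callees in added_call_sets:
                for pair in combinations(sorted(callees), 2):
                    pair_counts[pair] += 1
            return {p: n for p, n in pair_counts.items() if n >= min_support}

    A location that later gains a call to one half of a frequent pair such as ("lock", "unlock") but not the other is a candidate violation.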

    Mining and checking object behavior

    This thesis introduces a novel approach to modeling the behavior of programs at runtime. We leverage the structure of object-oriented programs to derive models that describe the behavior of individual objects. Our approach mines object behavior models: finite state automata where states correspond to different states of an object, and transitions are caused by method invocations. Such models capture the effects of method invocations on an object's state. To our knowledge, our approach is the first to combine the control flow with information about the values of variables. Our ADABU tool is able to mine object behavior models from the executions of large interactive JAVA programs. To investigate the usefulness of our technique, we study two different applications of object behavior models. Mining specifications: Many existing verification techniques are difficult to apply because in practice the necessary specifications are missing. We use ADABU to automatically mine specifications from the execution of test suites. To enrich these specifications, our TAUTOKO tool systematically generates test cases that exercise previously uncovered behavior. Our results show that, when fed into a typestate verifier, such enriched specifications are able to detect more bugs than the original versions. Generating fixes: We present PACHIKA, a tool to automatically generate possible fixes for failing program runs. Our approach uses object behavior models to compare passing and failing runs. Differences in the models both point to anomalies and suggest possible ways to fix the anomaly. In a controlled experiment, PACHIKA was able to synthesize fixes for real bugs mined from the history of two open-source projects.
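
    A minimal sketch of the data structure behind such models, written in Python for brevity and assuming a simple abstraction over field values; this is illustrative only, not ADABU's actual representation.

        class ObjectBehaviorModel:
            """Finite state automaton over abstract object states;
            transitions are labelled with the invoked method."""
            def __init__(self):
                self.transitions = set()  # (state_before, method, state_after)

            def record(self, state_before, method, state_after):
                self.transitions.add((state_before, method, state_after))

        def abstract_state(obj):
            """Abstract an object's fields, e.g. numbers to their sign."""
            def abstract(value):
                if isinstance(value, (int, float)):
                    return "neg" if value < 0 else "zero" if value == 0 else "pos"
                return "empty" if not value else "non-empty"
            return tuple(sorted((f, abstract(v)) for f, v in vars(obj).items()))

    Capturing abstract_state before and after each observed invocation and calling record with the method name yields an automaton whose states reflect, for instance, whether a stack is empty.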

    Mining the evolution of software component usage

    The topic of this thesis is the analysis of the evolution of software components. In order to track the evolution of software components, one needs to collect the evolution information of each component. This information is stored in the version control system (VCS) of the project: the repository of the history of events happening throughout the project's lifetime. By using software archive mining techniques one can extract and leverage this information. The main contribution of this thesis is the introduction of evolution usage trends and evolution change patterns. The raw information about the occurrences of each component is stored in the VCS of the project. By organizing it into evolution trends and patterns, we are able to draw conclusions and issue recommendations concerning each individual component and the project as a whole.
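
    As a deliberately crude illustration, assuming per-revision occurrence counts of a component have already been extracted from the VCS, a usage trend could be classified as follows (names and the classification rule are hypothetical):

        def usage_trend(counts_per_revision):
            """Classify the usage counts of one component across an
            ordered list of revisions."""
            if len(counts_per_revision) < 2:
                return "insufficient history"
            first, last = counts_per_revision[0], counts_per_revision[-1]
            if last > first:
                return "growing"
            if last < first:
                return "declining"
            return "stable"

    For example, usage_trend([3, 5, 9, 14]) returns "growing", which might support a recommendation to keep relying on that component.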

    Gestion des bibliothèques tierces dans un contexte de maintenance logicielle (Third-party library management in a software maintenance context)

    Software systems depend on third-party libraries to reduce development and maintenance costs. Developers gain access to robust functionality through the application programming interfaces these libraries provide. However, due to this strong relationship with the libraries, developers have to reconsider their position when the software evolves. In this thesis, we identify several research problems involving third-party libraries in a context of software maintenance. More specifically, a library may no longer satisfy the software's requirements and has to be replaced by a new one. We call this operation a library migration. We highlight three questions that characterize the impediments developers face in this situation: To which library should they migrate? How should they migrate their software? Who can help them in this case? This thesis suggests answers and presents several contributions to these problems. We define three approaches that are evaluated through several case studies. To carry out this work, we use a methodology based on software evolution analysis to observe and understand how software changes. We also describe perspectives for overcoming the current limitations of our solutions.
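
    The first question, to which library to migrate, can be pictured with a small sketch: rank replacement candidates by how often other projects dropped the old library and added another one in the same commit. This is an illustrative Python fragment under assumed input, not the approach evaluated in the thesis.

        from collections import Counter

        def migration_candidates(dependency_changes, source_library):
            """Rank replacements for source_library by how often it was
            removed while another library was added in the same commit.

            dependency_changes: iterable of (removed, added) sets of
            library names, one pair per commit."""
            targets = Counter()
            for removed, added in dependency_changes:
                if source_library in removed:
                    targets.update(added)
            return targets.most_common()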

    Task-based parser output combination: workflow and infrastructure

    This dissertation introduces the method of task-based parser output combination as a device to enhance the reliability of automatically generated syntactic information for further processing tasks. Parsers, i.e. tools generating syntactic analyses, are usually based on reference data, typically modern news texts. However, the data relevant for applications or tasks beyond parsing often differs from this standard domain, or only specific phenomena from the syntactic analysis are actually relevant for further processing. In these cases, the reliability of the parsing output might deviate essentially from the expected outcome on standard news text. Studies for several levels of analysis in natural language processing have shown that combining systems from the same analysis level outperforms the best involved single system. This is due to the different error distributions of the involved systems, which can be exploited, e.g. in a majority voting approach. In other words: for an effective combination, the involved systems have to be sufficiently different. In these combination studies, usually the complete analyses are combined and evaluated. However, to be able to combine the analyses completely, a full mapping of their structures and tagsets has to be found. The need for a full mapping either restricts the degree to which the participating systems are allowed to differ, or it results in information loss. Moreover, the evaluation of the combined complete analyses does not reflect the reliability achieved in the analysis of the specific aspects needed to resolve a given task. This work presents an abstract workflow which can be instantiated based on the respective task and the available parsers. The approach focusses on the task-relevant aspects and aims at increasing the reliability of their analysis. Moreover, this focus allows a combination of more diverging systems, since no full mapping of the structures and tagsets from the single systems is needed. The usability of this method is also increased by focussing on the output of the parsers: it is not necessary for the users to reengineer the tools. Instead, off-the-shelf parsers, and parsers for which no configuration options or sources are available to the users, can be included. Based on this, the method is applicable to a broad range of applications. For instance, it can be applied to tasks from the growing field of Digital Humanities, where the focus is often on tasks different from syntactic analysis.
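
    A majority vote over one task-relevant aspect of the parses can be sketched in a few lines of Python; the per-token label encoding is an assumption for illustration, not part of the workflow's infrastructure.

        from collections import Counter

        def majority_vote(analyses):
            """Combine per-token labels from several parsers by majority
            vote. analyses: one label sequence per parser, all of equal
            length, restricted to the task-relevant aspect so that no
            full tagset mapping is needed."""
            combined = []
            for labels in zip(*analyses):
                label, _count = Counter(labels).most_common(1)[0]
                combined.append(label)
            return combined

    For example, majority_vote([["A", "B"], ["A", "C"], ["A", "B"]]) yields ["A", "B"]: where the parsers disagree, the label supported by most systems wins.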