28 research outputs found

    A Survey on Design Pattern Recovery Techniques

    Get PDF
    The evaluation of design pattern recovery techniques and tools is significant as numbers of emergent techniques are presented and used in the past to recover patterns from source code of legacy applications. The problem of very diverse precision and recall values extracted by different pattern recovery techniques and tools on the same examined applications is not investigated thoroughly. It is very desirable to compare features of existing techniques as abundance of techniques supplemented with different tools has been presented in the last decade. We believe that new innovations for this discipline can be based on the empirical evaluation of existing techniques. The selected techniques cover the whole spectrum of state of the art research in design pattern recovery. The major contribution of this paper is a comprehensive discussion on state of the art in design pattern recovery research in the last decade followed by a proposed framework for classification and evaluation of existing design pattern recovery techniques. Finally we listed our observations as lessons learned which hamper design pattern recovery research and these observations can be used for future research directions and guidelines for this discipline

    A Study on Software Testability and the Quality of Testing in Object-Oriented Systems

    Get PDF
    Software testing is known to be important to the delivery of high-quality systems, but it is also challenging, expensive and time-consuming. This has motivated academic and industrial researchers to seek ways to improve the testability of software. Software testability is the ease with which a software artefact can be effectively tested. The first step towards building testable software components is to understand the factors – of software processes, products and people – that are related to and can influence software testability. In particular, the goal of this thesis is to provide researchers and practitioners with a comprehensive understanding of design and source code factors that can affect the testability of a class in object oriented systems. This thesis considers three different views on software testability that address three related aspects: 1) the distribution of unit tests in relation to the dynamic coupling and centrality of software production classes, 2) the relationship between dynamic (i.e., runtime) software properties and class testability, and 3) the relationship between code smells, test smells and the factors related to smells distribution. The thesis utilises a combination of source code analysis techniques (both static and dynamic), software metrics, software visualisation techniques and graph-based metrics (from complex networks theory) to address its goals and objectives. A systematic mapping study was first conducted to thoroughly investigate the body of research on dynamic software metrics and to identify issues associated with their selection, design and implementation. This mapping study identified, evaluated and classified 62 research works based on a pre-tested protocol and a set of classification criteria. Based on the findings of this study, a number of dynamic metrics were selected and used in the experiments that were then conducted. The thesis demonstrates that by using a combination of visualisation, dynamic analysis, static analysis and graph-based metrics it is feasible to identify central classes and to diagrammatically depict testing coverage information. Experimental results show that, even in projects with high test coverage, some classes appear to be left without any direct unit testing, even though they play a central role during a typical execution profile. It is contended that the proposed visualisation techniques could be particularly helpful when developers need to maintain and reengineer existing test suites. Another important finding of this thesis is that frequently executed and tightly coupled classes are correlated with the testability of the class – such classes require larger unit tests and more test cases. This information could inform estimates of the effort required to test classes when developing new unit tests or when maintaining and refactoring existing tests. An additional key finding of this thesis is that test and code smells, in general, can have a negative impact on class testability. Increasing levels of size and complexity in code are associated with the increased presence of test smells. In addition, production classes that contain smells generally require larger unit tests, and are also likely to be associated with test smells in their associated unit tests. There are some particular smells that are more significantly associated with class testability than other smells. Furthermore, some particular code smells can be seen as a sign for the presence of test smells, as some test and code smells are found to co-occur in the test and production code. These results suggest that code smells, and specifically certain types of smells, as well as measures of size and complexity, can be used to provide a more comprehensive indication of smells likely to emerge in test code produced subsequently (or vice versa in a test-first context). Such findings should contribute positively to the work of testers and maintainers when writing unit tests and when refactoring and maintaining existing tests

    Supporting Source Code Feature Analysis Using Execution Trace Mining

    Get PDF
    Software maintenance is a significant phase of a software life-cycle. Once a system is developed the main focus shifts to maintenance to keep the system up to date. A system may be changed for various reasons such as fulfilling customer requirements, fixing bugs or optimizing existing code. Code needs to be studied and understood before any modification is done to it. Understanding code is a time intensive and often complicated part of software maintenance that is supported by documentation and various tools such as profilers, debuggers and source code analysis techniques. However, most of the tools fail to assist in locating the portions of the code that implement the functionality the software developer is focusing. Mining execution traces can help developers identify parts of the source code specific to the functionality of interest and at the same time help them understand the behaviour of the code. We propose a use-driven hybrid framework of static and dynamic analyses to mine and manage execution traces to support software developers in understanding how the system's functionality is implemented through feature analysis. We express a system's use as a set of tests. In our approach, we develop a set of uses that represents how a system is used or how a user uses some specific functionality. Each use set describes a user's interaction with the system. To manage large and complex traces we organize them by system use and segment them by user interface events. The segmented traces are also clustered based on internal and external method types. The clusters are further categorized into groups based on application programming interfaces and active clones. To further support comprehension we propose a taxonomy of metrics which are used to quantify the trace. To validate the framework we built a tool called TrAM that implements trace mining and provides visualization features. It can quantify the trace method information, mine similar code fragments called active clones, cluster methods based on types, categorise them based on groups and quantify their behavioural aspects using a set of metrics. The tool also lets the users visualize the design and implementation of a system using images, filtering, grouping, event and system use, and present them with values calculated using trace, group, clone and method metrics. We also conducted a case study on five different subject systems using the tool to determine the dynamic properties of the source code clones at runtime and answer three research questions using our findings. We compared our tool with trace mining tools and profilers in terms of features, and scenarios. Finally, we evaluated TrAM by conducting a user study on its effectiveness, usability and information management

    Supporting feature-level software maintenance

    Get PDF
    Software maintenance is the process of modifying a software system to fix defects, improve performance, add new functionality, or adapt the system to a new environment. A maintenance task is often initiated by a bug report or a request for new functionality. Bug reports typically describe problems with incorrect behaviors or functionalities. These behaviors or functionalities are known as features. Even in very well-designed systems, the source code that implements features is often not completely modularized. The delocalized nature of features makes maintaining them challenging. Since maintenance tasks are expressed in terms of features, the goal of this dissertation is to support software maintenance at the feature-level. We focus on two tasks in particular: feature location and impact analysis via feature coupling.;Feature location is the process of identifying the source code that implements a feature, and it is an essential first step to any maintenance task. There are many existing techniques for feature location that incorporate various types of analyses such as static, dynamic, and textual. In this dissertation, we recognize the advantages of leveraging several types of analyses and introduce a new approach to feature location based on combining dynamic analysis, textual analysis, and web mining algorithms applied to software. The use of web mining for feature location is a novel contribution, and we show that our new techniques based on web mining are significantly more effective than the current state of the art.;After using feature location to identify a feature\u27s source code, maintenance can be completed on that feature. Impact analysis should then be performed to revalidate the system and determine which other features may have been affected by the modifications. We define three feature coupling metrics that capture the relationship between features based on structural information, textual information, and their combination. Our novel feature coupling metrics can be used for impact analysis to quantify the strength of coupling between pairs of features. We performed three empirical studies on open-source software systems to assess the feature coupling metrics and established three major results. First, there is a moderate to strong statistically significant correlation between feature coupling and faults. Second, feature coupling can be used to correctly determine about half of the other features that would be affected by a change to a given feature. Finally, we found that the metrics align with developers\u27 opinions about pairs of features that are actually coupled

    Visualization of graphs and trees for software analysis

    Get PDF
    A software architecture is an abstraction of a software system, which is indispensable for many software engineering tasks. Unfortunately, in many cases information pertaining to the software architecture is not available, outdated, or inappropriate for the task at hand. The RECONSTRUCTOR project focuses on software architecture reconstruction, i.e., obtaining architectural information from an existing system. Our research, which is part of RECONSTRUCTOR, focuses on interactive visualization and tries to answer the following question: How can users be enabled to understand the large amounts of information relevant for program understanding using visual representations? To answer this question, we have iteratively developed a number of techniques for visualizing software systems. A large number of these cases consists of hierarchically organized data, combined with adjacency relations. Examples are function calls within a hierarchically organized software system and correspondence relations between two different versions of a hierarchically organized software system. Hierarchical Edge Bundles (HEBs) are used to visualize adjacency relations in hierarchically organized data, such as the aforementioned function calls within a software system. HEBs significantly reduce visual clutter by visually bundling relations together. Massive Sequence Views (MSVs) are used in conjunction with HEBs to enable analysis of sequences of relations, such as function-call traces. HEBs are furthermore used to visually compare hierarchically organized data, e.g., two different versions of a software system. HEBs visually emphasize splits, joins, and relocations of subhierarchies and provide for interactive selection of sets of relations. Since HEBs require a hierarchy to perform the bundling, we present Force-Directed Edge Bundles (FDEBs) as an alternative to visually bundle relations together in the absence of a hierarchical component. FDEBs use a self-organizing approach to bundling in which edges are modeled as flexible springs that can attract each other. As a result, visual clutter is reduced and high-level edge patterns are better visible. Finally, in all these methods, a clear depiction of the direction of edges is important. We have therefore performed a separate study in which we evaluated ten representations (including the standard arrow) for depicting directed edges in a controlled user study

    Augmenting IDEs with Runtime Information for Software Maintenance

    Get PDF
    Object-oriented language features such as inheritance, abstract types, late-binding, or polymorphism lead to distributed and scattered code, rendering a software system hard to understand and maintain. The integrated development environment (IDE), the primary tool used by developers to maintain software systems, usually purely operates on static source code and does not reveal dynamic relationships between distributed source artifacts, which makes it difficult for developers to understand and navigate software systems. Another shortcoming of today's IDEs is the large amount of information with which they typically overwhelm developers. Large software systems encompass several thousand source artifacts such as classes and methods. These static artifacts are presented by IDEs in views such as trees or source editors. To gain an understanding of a system, developers have to open many such views, which leads to a workspace cluttered with different windows or tabs. Navigating through the code or maintaining a working context is thus difficult for developers working on large software systems. In this dissertation we address the question how to augment IDEs with dynamic information to better navigate scattered code while at the same time not overwhelming developers with even more information in the IDE views. We claim that by first reducing the amount of information developers have to deal with, we are subsequently able to embed dynamic information in the familiar source perspectives of IDEs to better comprehend and navigate large software spaces. We propose means to reduce or mitigate the information by highlighting relevant source elements, by explicitly representing working context, and by automatically housekeeping the workspace in the IDE. We then improve navigation of scattered code by explicitly representing dynamic collaboration and software features in the static source perspectives of IDEs. We validate our claim by conducting empirical experiments with developers and by analyzing recorded development sessions

    Regression test selection for distributed Java RMI programs by means of formal concept analysis

    Get PDF
    Software maintenance is the process of modifying an existing system to ensure that it meets current and future requirements. As a result, performing regression testing becomes an essential but time consuming aspect of any maintenance activity. Regression testing is initiated after a programmer has made changes to a program that may have inadvertently introduced errors. It is a quality control approach to ensure that the newly modified code still complies with its specified requirements and that unmodified code has not been affected by the maintenance activity. In the literature various types of test selection techniques have been proposed to reduce the effort associated with re-executing the required test cases. However, the majority of these approach has been focusing only on sequential programs, and provide no or only very limited support for distributed programs or database-driven applications. The thesis presents a lightweight methodology, which applies Formal Concept Analysis to support a regression test selection analysis, in combination with execution trace collection and external data sharing analysis, for distributed Java RMI programs. Two Eclipse plug-ins were developed to automate the regression test selection process and to evaluate our methodology

    Customizable Feature based Design Pattern Recognition Integrating Multiple Techniques

    Get PDF
    Die Analyse und Rückgewinnung von Architekturinformationen aus existierenden Altsystemen ist eine komplexe, teure und zeitraubende Aufgabe, was der kontinuierlich steigenden Komplexität von Software und dem Aufkommen der modernen Technologien geschuldet ist. Die Wartung von Altsystemen wird immer stärker nachgefragt und muss dabei mit den neuesten Technologien und neuen Kundenanforderungen umgehen können. Die Wiederverwendung der Artefakte aus Altsystemen für neue Entwicklungen wird sehr bedeutsam und überlebenswichtig für die Softwarebranche. Die Architekturen von Altsystemen unterliegen konstanten Veränderungen, deren Projektdokumentation oft unvollständig, inkonsistent und veraltet ist. Diese Dokumente enthalten ungenügend Informationen über die innere Struktur der Systeme. Häufig liefert nur der Quellcode zuverlässige Informationen über die Struktur von Altsystemen. Das Extrahieren von Artefakten aus Quellcode von Altsystemen unterstützt das Programmverständnis, die Wartung, das Refactoring, das Reverse Engineering, die nachträgliche Dokumentation und Reengineering Methoden. Das Ziel dieser Dissertation ist es Entwurfsinformationen von Altsystemen zu extrahieren, mit Fokus auf die Wiedergewinnung von Architekturmustern. Architekturmuster sind Schlüsselelemente, um Architekturentscheidungen aus Quellcode von Altsystemen zu extrahieren. Die Verwendung von Mustern bei der Entwicklung von Applikationen wird allgemein als qualitätssteigernd betrachtet und reduziert Entwicklungszeit und kosten. In der Vergangenheit wurden unterschiedliche Methoden entwickelt, um Muster in Altsystemen zu erkennen. Diese Techniken erkennen Muster mit unterschiedlicher Genauigkeit, da ein und dasselbe Muster unterschiedlich spezifiziert und implementiert wird. Der Lösungsansatz dieser Dissertation basiert auf anpassbaren und wiederverwendbaren Merkmal-Typen, die statische und dynamische Parameter nutzen, um variable Muster zu definieren. Jeder Merkmal-Typ verwendet eine wählbare Suchtechnik (SQL Anfragen, Reguläre Ausdrücke oder Quellcode Parser), um ein bestimmtes Merkmal eines Musters im Quellcode zu identifizieren. Insbesondere zur Erkennung verschiedener Varianten eines Musters kommen im entwickelten Verfahren statische, dynamische und semantische Analysen zum Einsatz. Die Verwendung unterschiedlicher Suchtechniken erhöht die Genauigkeit der Mustererkennung bei verschiedenen Softwaresystemen. Zusätzlich wurde eine neue Semantik für Annotationen im Quellcode von existierenden Softwaresystemen entwickelt, welche die Effizienz der Mustererkennung steigert. Eine prototypische Implementierung des Ansatzes, genannt UDDPRT, wurde zur Erkennung verschiedener Muster in Softwaresystemenen unterschiedlicher Programmiersprachen (JAVA, C/C++, C#) verwendet. UDDPRT erlaubt die Anpassung der Mustererkennung durch den Benutzer. Alle Abfragen und deren Zusammenspiel sind konfigurierbar und erlauben dadurch die Erkennung von neuen und abgewandelten Mustern. Es wurden umfangreiche Experimente mit diversen Open Source Software Systemen durchgeführt und die erzielten Ergebnisse wurden mit denen anderer Ansätze verglichen. Dabei war es möglich eine deutliche Steigerung der Genauigkeit im entwickelten Verfahren gegenüber existierenden Ansätzen zu zeigen.Recovering design information from legacy applications is a complex, expensive, quiet challenging, and time consuming task due to ever increasing complexity of software and advent of modern technology. The growing demand for maintenance of legacy systems, which can cope with the latest technologies and new business requirements, the reuse of artifacts from the existing legacy applications for new developments become very important and vital for software industry. Due to constant evolution in architecture of legacy systems, they often have incomplete, inconsistent and obsolete documents which do not provide enough information about the structure of these systems. Mostly, source code is the only reliable source of information for recovering artifacts from legacy systems. Extraction of design artifacts from the source code of existing legacy systems supports program comprehension, maintenance, code refactoring, reverse engineering, redocumentation and reengineering methodologies. The objective of approach used in this thesis is to recover design information from legacy code with particular focus on the recovery of design patterns. Design patterns are key artifacts for recovering design decisions from the legacy source code. Patterns have been extensively tested in different applications and reusing them yield quality software with reduced cost and time frame. Different techniques, methodologies and tools are used to recover patterns from legacy applications in the past. Each technique recovers patterns with different precision and recall rates due to different specifications and implementations of same pattern. The approach used in this thesis is based on customizable and reusable feature types which use static and dynamic parameters to define variant pattern definitions. Each feature type allows user to switch/select between multiple searching techniques (SQL queries, Regular Expressions and Source Code Parsers) which are used to match features of patterns with source code artifacts. The technique focuses on detecting variants of different design patterns by using static, dynamic and semantic analysis techniques. The integrated use of SQL queries, source code parsers, regular expressions and annotations improve the precision and recall for pattern extraction from different legacy systems. The approach has introduced new semantics of annotations to be used in the source code of legacy applications, which reduce search space and time for detecting patterns. The prototypical implementation of approach, called UDDPRT is used to recognize different design patterns from the source code of multiple languages (Java, C/C++, C#). The prototype is flexible and customizable that novice user can change the SQL queries and regular expressions for detecting implementation variants of design patterns. The approach has improved significant precision and recall of pattern extraction by performing experiments on number of open source systems taken as baselines for comparisons
    corecore