Search CORE

159 research outputs found

Design Patterns:Supporting Design Process by Automatically Detecting Design Patterns and Giving Some Feedback

Author: van Doorn Ed
Publication venue
Publication date: 24/08/2016
Field of study

Open University of the Netherlands Research Portal

Design Patterns:Supporting Design Process by Automatically Detecting Design Patterns and Giving Some Feedback

Author: van Doorn Ed
Publication venue
Publication date: 24/08/2016
Field of study

Since the nineties, design patterns are of interest. Since the beginning of this century, research is done to the possibilities of recognizing design patterns in source code. There is little research to recognizing design patterns in UML diagrams and possibilities to provide feedback on those designs. This has led to the following research question: Are design patterns in a UML class diagram automatically detectable, and can one automatically supply some feedback? The research question is answered by a literature review followed by creating and testing a prototype, which recognizes design patterns and provides some feedback. In literature UML class diagrams and design patterns are modelled using matrices, decision trees, boolean expressions, Prolog clauses and 4-tuples. Dong describes a method, by which the various types of relationship can be denoted in one matrix. In his article, however, there is no description of the algorithm by which a design pattern can be detected. Unsuccessfully attempt has been made to write a detection algorithm by applying the steepest descent method and Richardson iteration to the method of Dong. Next, four methods for detecting design patterns are compared. The corresponding articles, including the article of Ba-Brahem, contain no description of an algorithm by which design patterns can be detected. The method of Ba-Brahem, which is based on the use of 4-tuples, seemed most promising. This method also offers the possibility to detect design patterns, which are only partially present. It can also be used to give feedback on missing relationships. For the method of Ba-Brahem I have designed and implemented an algorithm, which is able to detect design patterns. The prototype, which implements the method of Ba-Brahem, is able to detect 17 of the 23 Gang of Four patterns. Within one second 13 different design patterns are detected in a class diagram, which contains 57 classes en 61 relationships. In a second experiment a class diagram represented by a XMI files, which contains 33 classes and 49 associations, was used. The XMI file orginates from ArgoUML. This tool is used to design class diagrams. Within one second 17 different design patterns were detected. The prototype is also able to detect design patterns, which are partially present in a class diagram. When a class diagram includes a number of classes and interfaces, which forma design pattern, feedback is specified about the relationships between these classes and interfaces, which are not part of the design pattern. The prototype has made it clear, that instead of a 4-tuple a 3-tuple can be used. One of the attributes of a 4-tuple is intended to indicate whether a class has a self-reference. However, this attribute provides no contribution for the recognition of the Singleton design pattern. An essential characteristic is a self-reference of an object and not a self-reference of a class. The prototype is compared with four programs, which are described in literature. Only one program provides feedback, which is more comprehensive than the feedback from my prototype. Two programs have a graphical user interface, which displays the detected design patterns. How a design patterns is read, is either not disclosed or less simple compared to my prototype. For other issues, such as recall, precision and performance, the score of my prototype is at least equal. The Open University considers to make my prototype user friendly, such that it can be used for educational purposes

Open University of the Netherlands Research Portal

Static Detection of Design Patterns in Class Diagrams

Author: Bates Chris
Gamma Erich
Lee Hakjin
Tsantalis N.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2019
Field of study

Contains fulltext : 218659.pdf (Publisher’s version ) (Open Access)CSERC '1

Open University of the Netherlands Research Portal

Crossref

Radboud Repository

Recommended from our members

Uncovering Features in Behaviorally Similar Programs

Author: Su Fang-Hsiang
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2018
Field of study

The detection of similar code can support many so ware engineering tasks such as program understanding and program classification. Many excellent approaches have been proposed to detect programs having similar syntactic features. However, these approaches are unable to identify programs dynamically or statistically close to each other, which we call behaviorally similar programs. We believe the detection of behaviorally similar programs can enhance or even automate the tasks relevant to program classification. In this thesis, we will discuss our current approaches to identify programs having similar behavioral features in multiple perspectives. We first discuss how to detect programs having similar functionality. While the definition of a program’s functionality is undecidable, we use inputs and outputs (I/Os) of programs as the proxy of their functionality. We then use I/Os of programs as a behavioral feature to detect which programs are functionally similar: two programs are functionally similar if they share similar inputs and outputs. This approach has been studied and developed in the C language to detect functionally equivalent programs having equivalent I/Os. Nevertheless, some natural problems in Object Oriented languages, such as input generation and comparisons between application-specific data types, hinder the development of this approach. We propose a new technique, in-vivo detection, which uses existing and meaningful inputs to drive applications systematically and then applies a novel similarity model considering both inputs and outputs of programs, to detect functionally similar programs. We develop the tool, HitoshiIO, based on our in-vivo detection. In the subjects that we study, HitoshiIO correctly detect 68.4% of functionally similar programs, where its false positive rate is only 16.6%. In addition to functional I/Os of programs, we attempt to discover programs having similar execution behavior. Again, the execution behavior of a program can be undecidable, so we use instructions executed at run-time as a behavioral feature of a program. We create DyCLINK, which observes program executions and encodes them in dynamic instruction graphs. A vertex in a dynamic instruction graph is an instruction and an edge is a type of dependency between two instructions. The problem to detect which programs have similar executions can then be reduced to a problem of solving inexact graph isomorphism. We propose a link analysis based algorithm, LinkSub, which vectorizes each dynamic instruction graph by the importance of every instruction, to solve this graph isomorphism problem efficiently. In a K Nearest Neighbor (KNN) based program classification experiment, DyCLINK achieves 90 + % precision. Because HitoshiIO and DyCLINK both rely on dynamic analysis to expose program behavior, they have better capability to locate and search for behaviorally similar programs than traditional static analysis tools. However, they suffer from some common problems of dynamic analysis, such as input generation and run-time overhead. These problems may make our approaches challenging to scale. Thus, we create the system, Macneto, which integrates static analysis with machine topic modeling and deep learning to approximate program behaviors from their binaries without truly executing programs. In our deobfuscation experiments considering two commercial obfuscators that alter lexical information and syntax in programs, Macneto achieves 90 + % precision, where the groundtruth is that the behavior of a program before and after obfuscation should be the same. In this thesis, we offer a more extensive view of similar programs than the traditional definitions. While the traditional definitions of similar programs mostly use static features, such as syntax and lexical information, we propose to leverage the power of dynamic analysis and machine learning models to trace/collect behavioral features of pro- grams. These behavioral features of programs can then apply to detect behaviorally similar programs. We believe the techniques we invented in this thesis to detect behaviorally similar programs can improve the development of software engineering and security applications, such as code search and deobfuscation

Columbia University Academic Commons

Fouille de graphes pour le suivi d’objets dans les vidéos

Author: Diot Fabien
Publication venue: HAL CCSD
Publication date: 03/06/2014
Field of study

Detecting and following the main objects of a video is necessary to describe its content in order to, for example, allow for a relevant indexation of the multimedia content by the search engines. Current object tracking approaches either require the user to select the targets to follow, or rely on pre-trained classifiers to detect particular classes of objects such as pedestrians or car for example. Since those methods rely on user intervention or prior knowledge of the content to process, they cannot be applied automatically on amateur videos such as the ones found on YouTube. To solve this problem, we build upon the hypothesis that, in videos with a moving background, the main objects should appear more frequently than the background. Moreover, in a video, the topology of the visual elements composing an object is supposed consistent from one frame to another. We represent each image of the videos with plane graphs modeling their topology. Then, we search for substructures appearing frequently in the database of plane graphs thus created to represent each video. Our contributions cover both fields of graph mining and object tracking. In the first field, our first contribution is to present an efficient plane graph mining algorithm, named PLAGRAM. This algorithm exploits the planarity of the graphs and a new strategy to extend the patterns. The next contributions consist in the introduction of spatio-temporal constraints into the mining process to exploit the fact that, in a video, the motion of objects is small from on frame to another. Thus, we constrain the occurrences of a same pattern to be close in space and time by limiting the number of frames and the spatial distance separating them. We present two new algorithms, DYPLAGRAM which makes use of the temporal constraint to limit the number of extracted patterns, and DYPLAGRAM_ST which efficiently mines frequent spatio-temporal patterns from the datasets representing the videos. In the field of object tracking, our contributions consist in two approaches using the spatio-temporal patterns to track the main objects in videos. The first one is based on a search of the shortest path in a graph connecting the spatio-temporal patterns, while the second one uses a clustering approach to regroup them in order to follow the objects for a longer period of time. We also present two industrial applications of our methodDétecter et suivre les objets principaux d’une vidéo est une étape nécessaire en vue d’en décrire le contenu pour, par exemple, permettre une indexation judicieuse des données multimédia par les moteurs de recherche. Les techniques de suivi d’objets actuelles souffrent de défauts majeurs. En effet, soit elles nécessitent que l’utilisateur désigne la cible a suivre, soit il est nécessaire d’utiliser un classifieur pré-entraîné à reconnaitre une classe spécifique d’objets, comme des humains ou des voitures. Puisque ces méthodes requièrent l’intervention de l’utilisateur ou une connaissance a priori du contenu traité, elles ne sont pas suffisamment génériques pour être appliquées aux vidéos amateurs telles qu’on peut en trouver sur YouTube. Pour résoudre ce problème, nous partons de l’hypothèse que, dans le cas de vidéos dont l’arrière-plan n’est pas fixe, celui-ci apparait moins souvent que les objets intéressants. De plus, dans une vidéo, la topologie des différents éléments visuels composant un objet est supposée consistante d’une image a l’autre. Nous représentons chaque image par un graphe plan modélisant sa topologie. Ensuite, nous recherchons des motifs apparaissant fréquemment dans la base de données de graphes plans ainsi créée pour représenter chaque vidéo. Cette approche nous permet de détecter et suivre les objets principaux d’une vidéo de manière non supervisée en nous basant uniquement sur la fréquence des motifs. Nos contributions sont donc réparties entre les domaines de la fouille de graphes et du suivi d’objets. Dans le premier domaine, notre première contribution est de présenter un algorithme de fouille de graphes plans efficace, appelé PLAGRAM. Cet algorithme exploite la planarité des graphes et une nouvelle stratégie d’extension des motifs. Nous introduisons ensuite des contraintes spatio-temporelles au processus de fouille afin d’exploiter le fait que, dans une vidéo, les objets se déplacent peu d’une image a l’autre. Ainsi, nous contraignons les occurrences d’un même motif a être proches dans l’espace et dans le temps en limitant le nombre d’images et la distance spatiale les séparant. Nous présentons deux nouveaux algorithmes, DYPLAGRAM qui utilise la contrainte temporelle pour limiter le nombre de motifs extraits, et DYPLAGRAM_ST qui extrait efficacement des motifs spatio-temporels fréquents depuis les bases de données représentant les vidéos. Dans le domaine du suivi d’objets, nos contributions consistent en deux approches utilisant les motifs spatio-temporels pour suivre les objets principaux dans les vidéos. La première est basée sur une recherche du chemin de poids minimum dans un graphe connectant les motifs spatio-temporels tandis que l’autre est basée sur une méthode de clustering permettant de regrouper les motifs pour suivre les objets plus longtemps. Nous présentons aussi deux applications industrielles de notre méthod

Thèses en Ligne

HAL-UJM

Theses.fr

Graph-based Pattern Matching and Discovery for Process-centric Service Architecture Design and Integration

Author: Gacitua-Decar Veronica
Publication venue: Dublin City University. School of Computing
Publication date: 07/09/2010
Field of study

Process automation and applications integration initiatives are often complex and involve significant resources in large organisations. The increasing adoption of service-based architectures to solve integration problems and the widely accepted practice of utilising patterns as a medium to reuse design knowledge motivated the definition of this work. In this work a pattern-based framework and techniques providing automation and structure to address the process and application integration problem are proposed. The framework is a layered architecture providing modelling and traceability support to different abstraction layers of the integration problem. To define new services - building blocks of the integration solution - the framework includes techniques to identify process patterns in concrete process models. Graphs and graph morphisms provide a formal basis to represent patterns and their relation to models. A family of graph-based algorithms support automation during matching and discovery of patterns in layered process service models. The framework and techniques are demonstrated in a case study. The algorithms implementing the pattern matching and discovery techniques are investigated through a set of experiments from an empirical evaluation. Observations from conducted interviews to practitioners provide suggestions to enhance the proposed techniques and direct future work regarding analysis tasks in process integration initiatives

Irish Universities

DCU Online Research Access Service

Semantics-aware image understanding

Author: PASINI ANDREA
Publication venue: country:Italy
Publication date: 19/10/2021
Field of study

L'abstract è presente nell'allegato / the abstract is in the attachmen

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Formal Verification, Quantitative Analysis and Automated Detection of Design Patterns

Author: Pradhan Prayasee
Publication venue
Publication date: 31/05/2015
Field of study

Present-day software engineering concepts emphasize on developing software based on design patterns. Design patterns form the basis of generic solution to a recurring design problem. Software requirement analysis and design methodologies based on different Unified Modelling Language (UML) diagrams need to be strengthened by the use of a number of design patterns. In this study, an attempt has been made for automated verification of the design patterns. A grammar has been developed for verification and recognition of selected design patterns. ANTLR (ANother Tool for Language Recognition) tool has been used for verification of developed grammar. After proper verification and validation of design patterns, there comes a need to quantitatively determine the quality of design patterns. Hence, we have provided a methodology to compare the quality attributes of a system having design pattern solution with a system having non-pattern solution, both the system intending to provide same functionalities. Using Quality Model for Object-Oriented Design (QMOOD) approach, the cut-off points are calculated in order to provide the exact size of the system in terms of the number of classes, for which the solution adopted using design pattern, provides more quality parameters. Again Design Pattern Detection (DPD) has also considered as an emerging field of Software Reverse Engineering. An attempt has been made to present a noble approach for design pattern detection with the help of Graph Isomorphism and Normalized Cross Correlation (NCC) techniques. Eclipse Plugin i.e., ObjectAid is used to extract UML class diagrams as well as the eXtensible Markup Language (XML) files from the Software System and Design Pattern. An algorithm is proposed to extract relevant information from the XML files, and Graph Isomorphism technique is used to find the pattern subgraph. Use of NCC provides the percentage existence of the pattern in the system

ethesis@nitr

Approximate Graph Matching for Software Engineering

Author: Kpodjedo Hinnoutondji
Publication venue
Publication date: 01/08/2011
Field of study

RÉSUMÉ La représentation en graphes est l’une des plus communément utilisées pour la modélisation de toutes sortes d’objets ou problèmes. Un graphe peut être brièvement présenté comme un ensemble de nœuds reliés entre eux par des relations appelées arêtes ou arcs. La comparaison d’objets, représentés sous forme de graphes, est un problème important dans de nombreuses applications et une manière de traiter cette question consiste à recourir à l’appariement de graphes. Le travail présenté dans cette thèse s’intéresse aux techniques d’appariement approché de graphes et à leur application dans des activités de génie logiciel, notamment celles ayant trait à l’évolution d’un logiciel. Le travail de recherche effectué comporte trois principaux aspects distincts mais liés. Premièrement, nous avons considéré les problèmes d’appariement de graphes sous la formulation ETGM (Error-Tolerant Graph Matching : Appariement de Graphes avec Tolérance d’Erreurs) et proposé un algorithme performant pour leur résolution. En second, nous avons traité l’appariement d’artefacts logiciels, tels que les diagrammes de classe, en les formulant en tant que problèmes d’appariement de graphes. Enfin, grâce aux solutions obtenues sur les diagrammes de classes, nous avons proposé des mesures d’évolution et évalué leur utilité pour la prédiction de défauts. Les paragraphes suivants détaillent chacun des trois aspects ci-dessus mentionnés. Appariement approché de graphes Plusieurs problèmes pratiques peuvent être formulés en tant que problèmes d’Appariement Approché de Graphes (AAG) dans lesquels le but est de trouver un bon appariement entre deux graphes. Malheureusement, la littérature existante ne propose pas de techniques génériques et efficaces prêtes à être utilisées dans les domaines de recherche autres que le traitement d’images ou la biochimie. Pour tenter de remédier à cette situation, nous avons abordé les problèmes AAG de manière générique. Nous avons ainsi d’abord sélectionné une formulation capable de modéliser la plupart des différents problèmes AAG (la formulation ETGM). Les problèmes AAG sont des problèmes d’optimisation combinatoire reconnus comme étant (NP-)difficiles et pour lesquels la garantie de solutions optimales requiert des temps prohibitifs. Les méta-heuristiques sont un recours fréquent pour ce genre de problèmes car elles permettent souvent d’obtenir d’excellentes solutions en des temps raisonnables. Nous avons sélectionné la recherche taboue qui est une technique avancée de recherche locale permettant de construire et modifier graduellement et efficacement une solution à un problème donné. Nos expériences préliminaires nous ont révélé qu’il était suffisant d’initialiser une recherche locale avec un sous-ensemble très réduit (2 à 5%) d’une solution (quasi-)optimale pour se garantir d’excellents résultats. Étant donné que dans la plupart des cas, cette information n’est pas disponible, nous avons recouru à l’investigation de mesures de similarités pouvant nous permettre de prédire quels sont les appariements de nœuds les plus prometteurs. Notre approche a consisté à analyser les voisinages des nœuds de chaque graphe pour associer à chaque possible appariement de nœud une valeur de similarité indiquant les chances de retrouver la paire de nœuds en question dans une solution optimale. Nous avons ainsi exploré plusieurs possibilités et découvert que celle qui fonctionnait le mieux utilisait l’estimation la plus conservatrice, celle où la notion de voisinage similaire est la plus « restrictive ». De plus, pour attacher un niveau de confiance à cette mesure de similarité, nous avons appliqué un facteur correcteur tenant compte notamment des alternatives possibles pour chaque paire de nœuds. La mesure de similarité ainsi obtenue est alors utilisée pour imposer une direction en début de recherche locale. L’algorithme qui en résulte, SIM-T a été comparé à différents algorithmes récents (et représentant l’état de l’art) et les résultats obtenus démontrent qu’il est plus efficace et beaucoup plus rapide pour l’appariement de graphes qui partagent une majorité d’éléments identiques. Appariement approché de diagrammes en génie logiciel. Compte tenu de la taille et de la complexité des systèmes orientés-objet, retrouver et comprendre l’évolution de leur conception architecturale est une tache difficile qui requiert des techniques appropriées. Diverses approches ont été proposées mais elles se concentrent généralement sur un problème particulier et ne sont en général pas adaptées à d’autres problèmes, pourtant conceptuellement proches. Sur la base du travail réalisé pour les graphes, nous avons proposé MADMatch, un algorithme (plusieurs-à-plusieurs) d’appariement approché de diagrammes. Dans notre approche, les diagrammes architecturaux ou comportementaux qu’on peut retrouver en génie logiciel sont représentés sous forme de graphes orientés dont les nœuds (appelées entités) et arcs possèdent des attributs. Dans notre formulation ETGM, les différences entre deux diagrammes sont comprises comme étant le résultat d’opérations d’édition (telles que la modification, le renommage ou la fusion d’entités) auxquelles sont assignées des coûts. L’une des principales différences des diagrammes traités, par rapport aux graphes de la première partie, réside dans la présence d’une riche information textuelle. MADMatch se distingue par son intégration de cette information et propose plusieurs concepts qui en tirent parti. En particulier, le découpage en mots et la combinaison des termes obtenus avec la topologie des diagrammes permettent de définir des contextes lexicaux pour chaque entité. Les contextes ainsi obtenus sont ultérieurement utilisés pour filtrer les appariements improbables et permettre ainsi des réductions importantes de l’espace de recherche. A travers plusieurs cas d’étude impliquant différents types de diagrammes (tels que les diagrammes de classe, de séquence ou les systèmes à transition) et plusieurs techniques concurrentes, nous avons démontré que notre algorithme peut s’adapter à plusieurs problèmes d’appariement et fait mieux que les précédentes techniques, quant à la précision et le passage à l’échelle. Des métriques d’évolution pour la prédiction de défauts. Les tests logiciels constituent la pratique la plus répandue pour garantir un niveau raisonnable de qualité des logiciels. Cependant, cette activité est souvent un compromis entre les ressources disponibles et la qualité logicielle recherchée. En développement Orienté-Objet (OO), l’effort de tests devrait se concentrer sur les classes susceptibles de contenir des défauts. Cependant, l’identification de ces classes est une tâche ardue pour laquelle ont été utilisées différentes métriques, techniques et modèles avec un succès mitigé. Grâce aux informations d’évolution obtenues par l’application de notre technique d’appariement de diagrammes, nous avons défini des mesures élémentaires d’évolution relatives aux classes d’un système OO. Nos mesures de changement sont définies au niveau des diagrammes de classes et incluent notamment les nombres d’attributs, de méthodes ou de relations ajoutés, supprimés ou modifiés de version en version. Elles ont été utilisées en tant que variables indépendantes dans des modèles de prédiction de défauts visant à recommander les classes les plus susceptibles de contenir des défauts. Les métriques proposées ont été évaluées selon trois critères (variables dépendantes des différents modèles) : la simple présence (oui/non) de défauts, le nombre de défauts et la densité de défauts (relativement au nombre de Lignes de Code). La principale conclusion de nos expériences est que nos mesures d’évolution prédisent mieux, et de façon significative, la densité de défauts que des métriques connues (notamment de complexité). Ceci indique qu’elles pourraient aider à réduire l’effort de tests en concentrant les activités e tests sur des volumes plus réduits de code. ----------ABSTRACT Graph representations are among the most common and effective ways to model all kinds of natural or human-made objects. Once two objects or problems have been represented as graphs (i.e. as collections of objects possibly connected by pairwise relations), their comparison is a fundamental question in many different applications and is referred to as graph matching. The work presented in this document investigates approximate graph matching techniques and their application in software engineering as efficient ways to retrieve the evolution through time of software artifacts. The research work we carried involves three distinct but related aspects. First, we consider approximate graph matching problems within the Error-Tolerant Graph Matching framework, and propose a tabu search technique initialised with local structural similarity measures. Second, we address the matching of software artifacts, such as class diagrams, and propose new concepts able to integrate efficiently the rich lexical information to our tabu search. Third, based on matchings obtained from the application of our approach to subsequent class diagrams, we proposed new design evolution metrics and assessed their usefulness in defect prediction models. The following paragraphs detail each of those three aspects. Approximate Graph Matching Many practical problems can be modeled as approximate graph matching (AGM) problems in which the goal is to find a ”good” matching between two objects represented as graphs. Unfortunately, existing literature on AGM do not propose generic techniques readily usable in research areas other than image processing and biochemistry. To address this situation, we tackled in a generic way, the AGM problems. For this purpose, we first select, out of the possible formulations, the Error Tolerant Graph Matching (ETGM) framework that is able to model most AGM formulations. Given that AGM problems are generally NP-hard, we based our resolution approach on meta-heuristics, given the demonstrated efficiency of this family of techniques on (NP-)hard problems. Our approach avoids as much as possible assumptions about graphs to be matched and tries to make the best out of basic graph features such as node connectivity and edge types. Consequently, the proposal is a local search technique using new node similarity measures derived from simple structural information. The proposed technique was devised as follows. First, we observed and empirically validated that initializing a local search with a very small subset of ”correct” node matches is enough to get excellent results. Instead of directly trying to correctly match all nodes and edges of a given graph to the nodes and edges of another graph, one could focus on correctly matching a reduced subset of nodes. Second, in order to retrieve such subsets, we resorted to the concept of local node similarity. Our approach consists in assessing, by analyzing their neighborhoods, how likely it is to have a pair of nodes included in a good matching. We investigated many ways of computing similarity values between pairs of nodes and proposed additional techniques to attach a level of confidence to computed similarity value. Our work results in a similarity enhanced tabu algorithm (Sim-T) which is demonstrated to be more accurate and efficient than known state-of-the-art algorithms. Approximate Diagram Matching in software engineering Given the size and complexity of OO systems, retrieving and understanding the history of the design evolution is a difficult task which requires appropriate techniques. Building on the work done for generic AGM problems, we propose MADMatch, a Many-to-many Approximate Diagram Matching algorithm based on an ETGM formulation. In our approach, design representations are modeled as attributed directed multi-graphs. Transformations such as modifying, renaming, or merging entities in a software diagram are explicitly taken into account through edit operations to which specific costs can be assigned. MADMatch fully integrates the textual information available on diagrams and proposes several concepts enabling accurate and fast computation of matchings. We notably integrate to our proposal the use of “termal footprints” which capture the lexical context of any given entity and are exploited in order to reduce the search space of our tabu search. Through several case studies involving different types of diagrams (such as class diagrams, sequence diagrams and labeled transition systems), we show that our algorithm is generic and advances the state of art with respect to scalability and accuracy. Design Evolution Metrics for Defect Prediction Testing is the most widely adopted practice to ensure software quality. However, this activity is often a compromise between the available resources and sought software quality. In object-oriented development, testing effort should be focused on defect-prone classes or alternatively on classes deemed critical based on criteria such as their connectivity or evolution profile. Unfortunately, the identification of defect-prone classes is a challenging and difficult activity on which many metrics, techniques, and models have been tried with mixed success. Following the retrieval of class diagrams’ evolution by our graph matching approach, we proposed and investigated the usefulness of elementary design evolution metrics in the identification of defective classes. The metrics include the numbers of added, deleted, and modified attributes, methods, and relations. They are used to recommend a ranked list of classes likely to contain defects for a system. We evaluated the efficiency of our approach according to three criteria: presence of defects, number of defects, and defect density in the top-ranked classes. We conducted experiments with small to large systems and made comparisons against well known complexity and OO metrics. Results show that the design evolution metrics, when used in conjunction with known metrics, improve the identification of defective classes. In addition, they provide evidence that design evolution metrics make significantly better predictions of defect density than other metrics and, thus, can help in reducing the testing effort by focusing test activity on a reduced volume of code

PolyPublie