10 research outputs found

    Exploring the Relation Between Co-changes and Architectural Smells

    The interplay between Maintainability and Reliability can be particularly complex, and different kinds of trade-offs may arise when developers try to optimise for either one of these two qualities. To further understand how Maintainability and Reliability influence each other, we perform an empirical study using architectural smells and source code file co-changes as proxies for these two qualities, respectively. The study is designed as an exploratory multiple-case case study following well-known guidelines and uses fourteen open source Java projects. Three different research questions are identified and investigated through statistical analysis. Co-changes are detected by using both a state-of-the-art algorithm and a novel approach. The three architectural smells selected are among the most important from the literature and are detected using open source tools. The results show that 50% of co-changes eventually end up taking part in an architectural smell. Moreover, statistical tests indicate that in 50% of the projects, files and packages taking part in smells are more likely to co-change than non-smelly files. Finally, in 90% of the cases where a smell and a co-change appear in the same file pair, the co-change appears before the smell. Our findings show that Reliability is indirectly affected by low levels of Maintainability even at the architectural level. This is because low-quality components require more frequent changes by the developers, increasing the chances of eventually introducing faults.
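As a rough illustration of what a file co-change is, the sketch below counts how often two files are modified in the same commit and keeps the pairs that meet a minimum count. It is not the state-of-the-art algorithm or the novel approach mentioned in the abstract, neither of which is described here. The function names, the `min_support` threshold, and the sample history are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def co_change_counts(commits):
    """Count how often each unordered pair of files appears in the same commit."""
    counts = Counter()
    for files in commits:
        # Sort and de-duplicate so (a, b) and (b, a) map to the same key.
        for pair in combinations(sorted(set(files)), 2):
            counts[pair] += 1
    return counts


def frequent_co_changes(commits, min_support=2):
    """Keep only the pairs that changed together at least `min_support` times."""
    counts = co_change_counts(commits)
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Hypothetical commit history: each entry lists the files touched by one commit.
history = [
    ["A.java", "B.java"],
    ["A.java", "B.java", "C.java"],
    ["C.java"],
]
result = frequent_co_changes(history)
```

With this sample history only the pair `A.java`/`B.java` reaches the threshold. A real study would typically also consider time windows, commit size, and statistical significance.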

    A framework for semi-automated software evolution analysis composition

    Software evolution data stored in repositories such as version control, bug and issue tracking, or mailing lists is crucial to better understand a software system and assess its quality. A myriad of analyses exploiting such data have been proposed throughout the years. However, easy and straightforward synergies between these analyses rarely exist. To tackle this problem we have investigated the concept of Software Analysis as a Service and devised SOFAS, a distributed and collaborative software evolution analysis platform. Software analyses are offered as services that can be accessed, composed into workflows, and executed over the Internet. This paper presents our framework for composing these analyses into workflows, consisting of a custom-made modeling language and a composition infrastructure for the service offerings. The framework exploits the RESTful nature of our analysis service architecture and comes with a service composer to enable semi-automated service compositions by a user. We validate our framework by showcasing two different approaches built on top of it that support different stakeholders in gaining a deeper insight into a project's history and evolution. As a result, our framework has shown its applicability to deliver diverse, complex analyses across system and tool boundaries.

    On the Stability of Software Clones: A Genealogy-Based Empirical Study

    Clones are a matter of great concern to the software engineering community because of their dual but contradictory impact on software maintenance. While there is strong empirical evidence of the harmful impact of clones on maintenance, a number of studies have also identified positive sides of code cloning during maintenance. Recently, to help determine if clones are beneficial or not during software maintenance, software researchers have been conducting studies that measure source code stability (the likelihood that code will be modified) of cloned code compared to non-cloned code. If the presence of clones in program artifacts (files, classes, methods, variables) causes the artifacts to be more frequently changed (i.e., cloned code is more unstable than non-cloned code), clones are considered harmful. Unfortunately, existing stability studies have resulted in contradictory results and even now there is no concrete answer to the research question "Is cloned or non-cloned code more stable during software maintenance?" The possible reasons behind the contradictory results of the existing studies are that they were conducted on different sets of subject systems with different experimental setups involving different clone detection tools investigating different stability metrics. Also, there are four major types of clones (Type 1: exact; Type 2: syntactically similar; Type 3: with some added, deleted or modified lines; and, Type 4: semantically similar) and none of these studies compared the instability of different types of clones. Focusing on these issues we perform an empirical study implementing seven methodologies that calculate eight stability-related metrics on the same experimental setup to compare the instability of cloned and non-cloned code in the maintenance phase. We investigated the instability of three major types of clones (Type 1, Type 2, and Type 3) from different dimensions. 
We excluded Type 4 clones from our investigation, because the existing clone detection tools cannot detect Type 4 clones well. In our in-depth investigation of hundreds of revisions of 16 subject systems covering four different programming languages (Java, C, C#, and Python) using two clone detection tools (NiCad and CCFinder), we found that clones generally exhibit higher instability in the maintenance phase compared to non-cloned code. Specifically, Type 1 and Type 3 clones are more unstable as well as more harmful compared to Type 2 clones. However, although clones are generally more unstable, they sometimes exhibit higher stability than non-cloned code. We further investigated the effect of clones on another important aspect of stability: method co-changeability (the degree to which methods change together). Intuitively, higher method co-changeability is an indication of higher instability of software systems. We found that clones do not have any negative effect on method co-changeability; rather, cloning can be a possible way of minimizing method co-changeability when clones are likely to evolve independently. Thus, clones have both positive and negative effects on software stability. Our empirical studies demonstrate how we can effectively use the positive sides of clones by minimizing their negative impacts.

    Analyzing Clone Evolution for Identifying the Important Clones for Management

    Code clones (identical or similar code fragments in a code-base) have dual but contradictory impacts (i.e., both positive and negative impacts) on the evolution and maintenance of a software system. Because of the negative impacts (such as high change-proneness, bug-proneness, and unintentional inconsistencies), software researchers consider code clones to be the number one bad-smell in a code-base. Existing studies on clone management suggest managing code clones through refactoring and tracking. However, a software system's code-base may contain a huge number of code clones, and it is impractical to consider all these clones for refactoring or tracking. In these circumstances, it is essential to identify code clones that can be considered particularly important for refactoring and tracking. However, no existing study has investigated this matter. We conduct our research emphasizing this matter, and perform five studies on identifying important clones by analyzing clone evolution history. In our first study we detect evolutionary coupling of code clones by automatically investigating clone evolution history from thousands of commits of software systems downloaded from on-line SVN repositories. By analyzing evolutionary coupling of code clones we identify a particular clone change pattern, Similarity Preserving Change Pattern (SPCP), such that code clones that evolve following this pattern should be considered important for refactoring. We call these important clones the SPCP clones. We rank SPCP clones considering their strength of evolutionary coupling. In our second study we further analyze evolutionary coupling of code clones with an aim to assist clone tracking. The purpose of clone tracking is to identify the co-change (i.e. changing together) candidates of code clones to ensure consistency of changes in the code-base. Our research in the second study identifies and ranks the important co-change candidates by analyzing their evolutionary coupling. 
In our third study we perform a deeper analysis on the SPCP clones and identify their cross-boundary evolutionary couplings. On the basis of such couplings we separate the SPCP clones into two disjoint subsets. While one subset contains the non-cross-boundary SPCP clones which can be considered important for refactoring, the other subset contains the cross-boundary SPCP clones which should be considered important for tracking. In our fourth study we analyze the bug-proneness of different types of SPCP clones in order to identify which type(s) of code clones have high tendencies of experiencing bug-fixes. Such clone-types can be given high priority for management (refactoring or tracking). In our last study we analyze and compare the late propagation tendencies of different types of code clones. Late propagation is commonly regarded as a harmful clone evolution pattern. Findings from our last study can help us prioritize clone-types for management on the basis of their tendencies of experiencing late propagations. We also find that late propagation can be considerably minimized by managing the SPCP clones. On the basis of our studies we develop an automatic system called AMIC (Automatic Mining of Important Clones) that identifies the important clones for management (refactoring and tracking) and ranks these clones considering their evolutionary coupling, bug-proneness, and late propagation tendencies. We believe that our research findings have the potential to assist clone management by pinpointing the important clones to be managed, and thus, considerably minimizing clone management effort.

    Análise da evolução de software com séries temporais (Analysis of Software Evolution with Time Series)

    Work presented within the Master's program in Computer Engineering (Mestrado em Engenharia Informática), as a partial requirement for obtaining the degree of Master in Computer Engineering. A software system is never finished. Even after it has been delivered, software continues to evolve. For this reason, governments, companies, and open source communities regularly spend considerable resources to fix, adapt, or improve their software systems. Some studies report that about 90% of the resources companies devote to software are spent on maintenance activities. This implies that only 10% is devoted to other activities, including the development of new projects. This represents an opportunity to make the software process more efficient through better planning, with significant resulting economic gains. This is why the ability to develop software quickly and reliably is a major challenge in Software Engineering. One possible technique for helping to reduce costs and produce quality software is predicting its future behavior. For project managers and programmers, predicting software evolution will be very useful, since it will allow effort to be directed to the parts that need the most intervention. For prediction to be possible, it is necessary to analyze the history of the software's life, which is usually stored in the projects' data repositories. We therefore intend to analyze the evolution of the Eclipse IDE using time series. This analysis will make it possible to visualize the evolution of the number of Eclipse defects over time and to forecast it into the future. Data from the defect tracking system will be used. This will make it possible to identify possible patterns and trends in the distribution of the number of defects, enabling the creation of a reliable forecasting model. 
The result of this dissertation will constitute one more case study of the evolution of a successful, long-lived, and widely used system. It differs in its objectives from previous work on Eclipse but, together with another study, reinforces the use of time series analysis, an insufficiently explored technique, in the study of software evolution, particularly in forecasting that evolution.

    Visualisations novatrices pour la compréhension de réseaux et de logiciels complexes (Innovative Visualizations for Understanding Complex Networks and Software)

    Information visualization has the potential to exploit our visual abilities, acquired over hundreds of millions of years of evolution, to facilitate the discovery of secrets buried in data, new patterns, or unsuspected relationships. There is, however, a wide variety of data, more or less structured, that we seek to understand from various perspectives. In particular, network data are used to model important phenomena, such as social communities or financial transactions, but can be difficult to represent if the networks are large, hierarchical, and/or dynamic. This thesis focuses on the design of new network visualization techniques, with the aim of facilitating the understanding of data. The visualization techniques found in the literature are useful in certain contexts, and each has limitations. Nevertheless, there are still unexplored possibilities for creating new ways of representing data. Validating these new techniques remains a challenge. Moreover, interfaces must be simple to use while also facilitating data analysis and exploration. To study new visualization options that facilitate data comprehension tasks, we first classified prior work using taxonomies. In this way, we were also able to highlight new avenues for potentially interesting hybrids (that is, combinations of approaches) for visualizing static and dynamic networks. The contributions presented in this thesis cover different aspects of the visualization of complex and dynamic networks. The first chapter focuses on the visualization of static networks containing hierarchies, through the combination of approaches. 
The prototype described in the second chapter also makes it possible to combine visual representations, but can additionally be used to model dynamic graphs. Finally, the third chapter presents a new visual method applied to trace the evolution of complex design structures in software (modeled as networks). In the first prototype (TreeMatrix), parts of graphs are shown with matrices and node-link diagrams, while trees are represented by icicle diagrams and groupings. Unlike other visualizations in the literature, this new technique helps show dense networks without hindering the understanding of higher-level links. A user experiment showed certain advantages for discovering and organizing the links between modules within a software system, compared with the commercial software Lattix. We also combined approaches in a novel way for our second prototype (DiffAni) in order to visualize networks that evolve over time. DiffAni is the first interactive hybrid for dynamic graphs, and its validation with participants brought out certain advantages. In particular, the use of animation should be moderate and is mostly useful during significant movements. These results, together with our taxonomies, could help guide the creation of new hybrids in the future. The third prototype (IHVis) facilitated the exploration and tracing of design structures in evolving software (modeled as networks) from source code repositories. This new visualization notably revealed cases of introduction of stability points and refactorings, and some participants also found other interesting information, such as the extension of functionality through the implementation of interfaces. 
In summary, this thesis presents innovative and useful ways of visualizing complex and dynamic networks. Our main contributions are (1) the exploration of design spaces for new network visualizations using taxonomies, (2) the design of prototypes combining approaches to visualize hierarchical and dynamic networks, (3) the design of a new visual method for exploring variations and instabilities within evolving software, and (4) the evaluation of these techniques through experiments with participants.