Comprehension and change impact analysis of aspect-oriented programs through declarative reasoning
In this dissertation, we discuss an approach to support declarative reasoning over aspect-oriented programs, using the AspectJ programming language as a notable and representative technology. The approach is based on (i) the transformation of source code into a set of facts, and (ii) the definition and implementation of relationships and dependencies between different elements of the system as rules, stored in a Prolog database. Declarative analysis allows us to extract complex information through its rich and expressive mechanisms. Our approach makes two contributions. First, it can improve the comprehension of AspectJ programs, and it can be deployed for any AspectJ-like language, such as AspectC# or AspectC++. Second, it provides change impact analysis for AspectJ programs. Our method is automated and tool support is available. Expected beneficiaries of our approach include system maintainers performing tasks during the "change planning" stage of evolution.
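The idea of facts plus rules can be illustrated with a minimal sketch. It is written in Python rather than Prolog and is not the dissertation's rule set: the dependency facts, the element names, and the helper `impacted_by` are all hypothetical, and the single "rule" shown is a plain transitive closure over a generic depends-on relation.

```python
from collections import defaultdict, deque

# Hypothetical facts, in the spirit of Prolog facts such as depends_on(X, Y).
# Each pair (a, b) reads "a depends on b"; all element names are invented.
FACTS = [
    ("LoggingAspect.logCall", "Account.withdraw"),  # advice applies to a method
    ("Account.withdraw", "Account.balance"),        # method reads a field
    ("Bank.transfer", "Account.withdraw"),          # method calls a method
]


def impacted_by(changed, facts):
    """Return every element that transitively depends on `changed`."""
    # Index the facts by target so dependents can be looked up directly.
    dependents = defaultdict(set)
    for source, target in facts:
        dependents[target].add(source)

    # Breadth-first traversal plays the role of a recursive Prolog rule.
    seen, queue = set(), deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in dependents[current]:
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen


if __name__ == "__main__":
    print(sorted(impacted_by("Account.balance", FACTS)))
```

Under these assumed facts, a change to the field `Account.balance` reaches the method that reads it, the caller of that method, and the advice woven around it, which is the kind of answer a maintainer would want at the change planning stage.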
Automatic Detection and Classification of Identifier Renamings
SUMMARY
The lexicon of source code plays a key role in software maintainability. A poor lexicon can lead to poor comprehension of the program and to an increase in software faults. It is therefore important that developers maintain the lexicon of their source code by renaming identifiers so that they reflect the concepts they express. In this thesis, we study the lexicon and propose an approach to detect and classify identifier renamings in source code.
Renaming detection is based on a combination of two techniques: source code differencing and data flow analysis. The renaming classifier, in turn, uses an ontological database and a natural language parser to classify renamings according to the taxonomy we defined. To evaluate the accuracy and completeness of the renaming detector, we conducted an empirical study on the history of five open-source Java programs. The results of this study report a precision of 88% and a recall of 92%.
We also conducted an exploratory study that analyzes and discusses how identifiers are renamed, according to the proposed taxonomy, in the five Java programs of the previous study. The results of this exploratory study show that renamings occur in every dimension of our taxonomy.
To apply the proposed approach to PHP programs, we adapted our renaming detector to take into account the characteristics inherent to these programs. A preliminary study carried out on three PHP programs shows that our approach is applicable to PHP programs. However, these programs exhibit renaming trends that differ from those observed in the Java programs.
This thesis provides two outcomes. First, the detection and classification of renamings, together with a tool that can be used to document renamings. Developers will be able, for example, to look up methods that are part of the programming interface, since these affect client applications. They will also be able to identify inconsistencies between the name and the functionality of an entity after a so-called risky renaming, such as a renaming to an antonym. Second, the results of our studies provide lessons that form a base of knowledge and advice that can help developers avoid inappropriate or unnecessary renamings and thus keep the lexicon of their source code consistent.
ABSTRACT
Source code lexicon plays a paramount role in software maintainability: a poor lexicon can lead to poor comprehensibility and increase software fault-proneness. For this reason, developers should maintain their source code lexicon by renaming identifiers when they do not reflect the concepts that they should express. In this thesis, we study lexicon and propose an approach to detect and classify identifier renamings in source code. The renaming detection is based on a combination of source code differencing and data flow analysis, while the renaming classifier uses an ontological database and a natural language parser to classify renamings according to a taxonomy we define. We report a study, conducted on the evolution history of five open-source Java programs, aimed at evaluating the accuracy and completeness of the renaming detector. The study reports a precision of 88% and a recall of 92%. In addition, we report an exploratory study investigating and discussing how identifiers are renamed in the five Java programs, according to our taxonomy. Moreover, we report the challenges and applicability of the proposed approach to PHP programs and report our preliminary results of renaming detection and classification for three programs. This thesis provides two outcomes. First, the renaming detection and classification approach and tool, which can be used for documenting renamings. Developers will be able to, for example, look up methods that are part of the public API (as they impact client applications), or look for inconsistencies between the name and the implementation of an entity that underwent a high-risk renaming (e.g., towards the opposite meaning). Second, pieces of actionable knowledge, based on our qualitative study of renamings, that provide advice on how to avoid some unnecessary renamings.
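As a reminder of what the two reported figures measure, the following minimal sketch computes precision and recall for a renaming detector against a manually built oracle. The function name `precision_recall` and the sample renamings are hypothetical and are not taken from the thesis or its data.

```python
def precision_recall(detected, oracle):
    """Compare detected renamings with an oracle of true renamings.

    Both arguments are iterables of (old_name, new_name) pairs.
    Precision is the share of detected renamings that are correct;
    recall is the share of true renamings that were detected.
    """
    detected, oracle = set(detected), set(oracle)
    true_positives = len(detected & oracle)
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(oracle) if oracle else 0.0
    return precision, recall


if __name__ == "__main__":
    # Invented example data, for illustration only.
    detected = [("getCnt", "getCount"), ("tmp", "buffer"),
                ("open", "close"), ("idx", "index")]
    oracle = [("getCnt", "getCount"), ("open", "close"),
              ("idx", "index"), ("usr", "user")]
    print(precision_recall(detected, oracle))
```

In this invented example three of the four detected renamings are in the oracle and three of the four oracle renamings are detected, so both measures come out at 0.75.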
Restructuring source code identifiers
In software engineering, maintenance accounts for 60% of the overall lifecycle cost of a software product. Program comprehension is a substantial part of maintenance and evolution cost; thus, any advancement in maintenance, evolution, and program understanding could greatly reduce the total cost of ownership of a software product. Identifiers are an important source of information during program understanding and maintenance. Programmers often use identifiers to build their mental models of the software artifacts. Thus, poorly chosen identifiers have been reported in the literature as misleading and as increasing the program comprehension effort. Identifiers are composed of terms, which can be dictionary words, acronyms, contractions, or simple strings. We conjecture that the use of identical terms in different contexts may increase the risk of faults, and hence maintenance effort. We investigate our conjecture using a measure combining term entropy and term context-coverage to study whether certain terms increase the odds ratios of methods to be fault-prone. We compute term entropy and context-coverage of terms extracted from identifiers in Rhino 1.4R3 and ArgoUML 0.16. We show statistically that methods containing terms with high entropy and context-coverage are more fault-prone than others, and that the new measure is only partially correlated with size. We will build on this study and apply summarization techniques to extract linguistic information from methods and classes. Using this information, we will extract domain concepts from source code and propose linguistic-based refactoring.
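Since the study works on terms extracted from identifiers, a minimal sketch of one possible splitting step may help. It assumes that terms are separated by underscores, digits, and camelCase boundaries; the function `split_identifier` is hypothetical, and the abstract does not say which splitting technique was actually used.

```python
import re


def split_identifier(identifier):
    """Split an identifier into lowercase terms.

    Assumed conventions: underscores and digits act as separators, and
    camelCase boundaries start a new term. A run of capitals is kept
    together as an acronym (e.g. "XML" in "getXMLParser").
    """
    terms = []
    for part in re.split(r"[_\d]+", identifier):
        # Either an acronym not followed by a lowercase letter,
        # or an optionally capitalized lowercase word.
        terms.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", part))
    return [term.lower() for term in terms]


if __name__ == "__main__":
    for name in ("getXMLParser", "max_line_count", "node2Id"):
        print(name, split_identifier(name))
```

A splitter of this kind cannot recognize contractions or concatenated words without separators (for example `maxlinecount`), which would need a dictionary-based technique.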
An empirical study on the relation between identifiers and fault proneness
Poorly chosen identifiers have been reported in the literature as misleading and as increasing the program comprehension effort. Identifiers are composed of terms, which can be dictionary words, acronyms, contractions, or simple strings. We conjecture that the use of identical terms in different contexts may increase the risk of faults. We investigate our conjecture using a measure combining term entropy and term context-coverage to study whether certain terms increase the odds ratios of methods to be fault-prone. Entropy measures the physical dispersion of terms in a program: the higher the entropy, the more scattered the terms are across the program. Context coverage measures the conceptual dispersion of terms: the higher their context coverage, the more unrelated the methods using them. We compute term entropy and context-coverage of terms extracted from identifiers in Rhino 1.4R3 and ArgoUML 0.16. We show statistically that methods containing terms with high entropy and context-coverage are more fault-prone than others.
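The notion of entropy as physical dispersion can be made concrete with a minimal sketch. It assumes plain Shannon entropy over the distribution of a term's occurrences across methods; the paper's exact normalization may differ, and the function `term_entropy` and the sample counts are hypothetical.

```python
import math


def term_entropy(occurrences):
    """Shannon entropy, in bits, of a term's occurrences across methods.

    `occurrences` maps a method name to the number of times the term
    appears in that method. A term confined to one method has entropy 0;
    a term spread evenly over many methods has high entropy.
    """
    total = sum(occurrences.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in occurrences.values():
        if count > 0:
            probability = count / total
            entropy -= probability * math.log2(probability)
    return entropy


if __name__ == "__main__":
    # Invented counts, for illustration only.
    concentrated = {"Parser.parse": 4}
    scattered = {"Parser.parse": 1, "Lexer.next": 1,
                 "Node.visit": 1, "Scope.lookup": 1}
    print(term_entropy(concentrated), term_entropy(scattered))
```

With these invented counts, the term used four times in a single method has entropy 0, while the term used once in each of four methods has entropy 2 bits, matching the intuition that higher entropy means a more scattered term.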
Special Issue of PPAP 2013: Preface
On the 5th of March, 2013, the first workshop on Patterns Promotion and Anti-patterns Prevention (PPAP 2013) took place in Genova, Italy. PPAP 2013 was co-located with the 17th European Conference on Software Maintenance and Reengineering (CSMR'2013), the premier European conference on the theory and practice of maintenance, reengineering and evolution of software systems. With the aim of promoting the application of patterns and preventing the spread of anti-patterns, the first objective of PPAP is to build a bridge between the different families of patterns and anti-patterns in software engineering.