    Linguistic Anti-Patterns: Impact Analysis on Code Quality

    Design smells are bad practices in software design that lower the quality of software systems. They represent architectural, design, and implementation choices that should be tracked and removed. In this work, we consider two special types of design smells, design anti-patterns (DAPs) and linguistic anti-patterns (LAs), in contrast to design patterns (DPs). DAPs are software patterns that developers believe to be good solutions to some design problems but that actually have a negative impact on quality. Recent studies have brought evidence that DAPs make maintenance more difficult in object-oriented systems and increase change- and fault-proneness. LAs are bad practices in the naming, documentation, and implementation of code entities, which can decrease the quality of software systems and hinder program comprehension. In contrast to design smells, DPs are promising solutions for improving the quality of object-oriented systems. Yet, against popular wisdom, design patterns can in practice impact quality negatively. We therefore also consider design patterns in this work, to study their behavior and quality as software evolves. Achieving good quality is important to control and reduce the maintenance cost of object-oriented systems, and this goal requires means to measure the quality of systems. However, quality has different meanings, e.g., the capacity of a system to change at low cost or the absence of bugs. In this thesis, we consider code understanding, change-proneness, and fault-proneness as three proxy measures for quality. As software evolves, developers risk introducing anti-patterns during their development tasks (fixing bugs, adding new features, or applying new requirements). We studied the impact of design anti-patterns, linguistic anti-patterns, and design patterns on software quality.
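
As an illustration of what a linguistic anti-pattern can look like, the following minimal Python sketch contrasts an accessor whose name promises a plain read but whose body also changes state with a version whose names match its behavior. The classes, method names, and fee value are hypothetical and are not taken from the thesis.

```python
class InconsistentAccount:
    """Hypothetical class: get_balance() hides a side effect behind its name."""

    def __init__(self, balance: float) -> None:
        self.balance = balance

    def get_balance(self) -> float:
        # The name suggests a pure accessor, but the method also mutates state.
        self.balance -= 0.5  # hidden inquiry fee (invented for this example)
        return self.balance


class ConsistentAccount:
    """Same behavior, with names that say what each method does."""

    def __init__(self, balance: float) -> None:
        self.balance = balance

    def get_balance(self) -> float:
        # Pure accessor: no side effects.
        return self.balance

    def charge_inquiry_fee(self, fee: float = 0.5) -> float:
        # The mutation is explicit in the method name.
        self.balance -= fee
        return self.balance
```

A reader of the first class would reasonably expect two consecutive calls to `get_balance()` to return the same value; the mismatch between name and implementation is the kind of inconsistency that can hinder comprehension.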

    Evaluating Design Decay during Software Evolution

    Software systems evolve, requiring continuous maintenance and development. They undergo changes throughout their lifetimes as new features are added and bugs are fixed. As these systems evolve, their designs tend to decay with time and become less adaptable to changing users' requirements. 
Consequently, software designs become more complex over time and harder to maintain; in some not-so-rare cases, developers prefer redesigning from scratch rather than prolonging the life of existing designs, which causes development and maintenance costs to rise. Therefore, developers must understand the factors that drive the decay of their designs and take proactive steps that facilitate future changes and slow down decay. Design decay occurs when changes are made to a software system by developers who do not understand its original design. On the one hand, making software changes without understanding their effects may lead to the introduction of bugs and the premature retirement of the system. On the other hand, when developers lack knowledge and/or experience in solving a design problem, they may introduce design defects, which are conjectured to have a negative impact on the evolution of systems and thus lead to design decay. Thus, developers need mechanisms to understand how a change to a system will impact the rest of the system and tools to detect design defects. In this dissertation, we propose three principal contributions. The first contribution aims to evaluate design decay. Measuring design decay consists of using a diagram matching technique to identify structural changes among versions of a design, such as a class diagram. Finding structural changes occurring in long-lived, evolving designs requires the identification of class renamings. Thus, the first step of our approach concerns the identification of class renamings in evolving designs. Then, the second step requires matching several versions of an evolving design to identify decaying and stable parts of the design. We propose bit-vector and incremental clustering algorithms to match several versions of an evolving design. The third step consists of measuring design decay. We propose a set of metrics, computed on the stable parts of the design, to evaluate this decay. The second contribution is related to change impact analysis. We present a new metaphor inspired by seismology to identify the change impact. In particular, our approach considers a change to a class as an earthquake that propagates through a long chain of intermediary classes. Our approach combines static dependencies between classes and historical co-change relations to measure the scope of change propagation in a system, i.e., how far a change will propagate from a “changed class” to other classes. The third contribution concerns design defect detection. We propose a metaphor inspired by the natural immune system. Like any living creature, designs are subject to diseases, which are design defects. Detection approaches are defense mechanisms of designs. A natural immune system can detect similar pathogens with good precision. This good precision has inspired a family of classification algorithms, Artificial Immune Systems (AIS) algorithms, which we use to detect design defects. The three contributions are evaluated on open-source object-oriented systems, and the obtained results enable us to draw the following conclusions:
• The design decay metrics Tunnel Triplets Metric (TTM) and Common Triplets Metric (CTM) provide developers with useful insights regarding design decay. If TTM decreases, then the original design decays. If TTM is stable, then the original design is stable, which means that the system is better adapted to changing requirements.
• Seismology provides an interesting metaphor for change impact analysis. Changes propagate in systems like earthquakes. The change impact is most severe near the changed class and drops off with distance from it. Using external information, we show that our approach helps developers locate the change impact easily.
• The immune system provides an interesting metaphor for detecting design defects. The results of the experiments showed that the precision and recall of our approach are comparable or superior to those of previous approaches.
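
The seismology metaphor above can be pictured with a minimal Python sketch. It assumes a simple model that is not taken from the dissertation: classes are grouped by their distance from the changed class in the static dependency graph (breadth-first search), and a hypothetical score combines the co-change count with that distance so that impact drops off farther from the "epicenter". The function names, the scoring formula, and the sample data are all illustrative.

```python
from collections import deque


def impact_levels(dependents, changed):
    """Distance of each reachable class from the changed class (BFS)."""
    distance = {changed: 0}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for neighbour in dependents.get(current, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[current] + 1
                queue.append(neighbour)
    return distance


def impact_scores(dependents, co_changes, changed):
    """Hypothetical score: (1 + co-change count) damped by distance."""
    scores = {}
    for cls, dist in impact_levels(dependents, changed).items():
        if cls == changed:
            continue
        history = co_changes.get(frozenset((changed, cls)), 0)
        scores[cls] = (1 + history) / dist
    return scores


# Illustrative data: which classes depend on which, and how often
# pairs of classes changed together in the version history.
dependents = {"Order": ["Invoice", "Cart"], "Invoice": ["Report"]}
co_changes = {frozenset(("Order", "Invoice")): 3,
              frozenset(("Order", "Report")): 1}

levels = impact_levels(dependents, "Order")
scores = impact_scores(dependents, co_changes, "Order")
```

In this toy data, `Invoice` sits next to the changed class and has often changed with it, so it receives the highest score, while `Report` is two steps away and scores lower.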

    Software Fault Prediction and Test Data Generation Using Artificial Intelligent Techniques

    The complexity of the requirements of present-day software, which is often very large, has led to an increase in the number of lines of code and, in turn, in the number of modules. Some of these modules may give rise to a variety of defects if testing is not done meticulously. In practice, it is not possible to carry out white-box testing of every module of a software system. Thus, software testing needs to be done selectively, on the modules that are prone to faults. Identifying the probable fault-prone modules is therefore a critical task. This dissertation emphasizes the design of prediction and classification models to detect fault-prone classes in object-oriented programs. Then, test data are generated for a particular task to check the functionality of the software product. In the field of object-oriented software engineering, the Chidamber and Kemerer (CK) software metrics suite is frequently used for fault prediction analysis, as it covers the unique aspects of object-oriented programming such as complexity, data abstraction, and inheritance. One of the most important goals of fault prediction is to detect fault-prone modules as early as possible in the software development life cycle (SDLC). Numerous authors have used design and code metrics for predicting fault-prone modules. In this work, design metrics are used for fault prediction. To carry out fault prediction analysis, prediction models are designed using methods such as statistical methods, artificial neural networks, radial basis function networks, functional link artificial neural networks, and probabilistic neural networks. In the first phase, fault prediction is performed using the CK metrics suite. 
In the next phase, the reduced feature sets of the CK metrics suite obtained by applying principal component analysis and rough set theory are used to perform fault prediction. A comparative analysis is carried out to find a suitable prediction model among the designed models. Prediction models designed for fault-proneness need to be validated for their efficiency. To achieve this, a cost-based evaluation framework is designed to evaluate the effectiveness of the designed fault prediction models. This framework is based on the classification of classes as faulty or not faulty. In this cost-based analysis, fault prediction is found to be suitable where the normalized estimated fault removal cost (NEcost) is less than a certain threshold value. This also indicates that any prediction model with an NEcost greater than the threshold value is not suitable for fault prediction, and the classes are then unit tested instead. All the prediction and classifier models used in the fault prediction analysis are applied to a case study, viz., Apache Integration Framework (AIF). The metric data values are obtained from the PROMISE repository and are mined using the Chidamber and Kemerer Java Metrics (CKJM) tool. Test data are generated for an object-oriented program for the withdrawal task of a bank ATM using three meta-heuristic search algorithms: the clonal selection algorithm, binary particle swarm optimization, and the artificial bee colony algorithm. The artificial bee colony algorithm is observed to obtain near-optimal test data when compared to the other two algorithms. The test data are generated for the withdrawal task based on a fitness function derived using the branch distance proposed by Bogdan Korel. The generated test data ensure the proper functionality, or correctness, of the programmed module.
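
To make the idea of a branch-distance fitness function concrete, here is a minimal Python sketch. It uses one common formulation of branch distance (zero when a predicate holds, otherwise the numeric gap plus a small constant) and a hypothetical set of withdrawal conditions; the constant `K`, the three predicates, and the function names are assumptions for illustration, not the fitness function used in the dissertation.

```python
K = 1  # small positive constant added when a predicate is false (assumption)


def dist_gt(a, b):
    """Branch distance for the predicate a > b: 0 when it holds."""
    return 0 if a > b else (b - a) + K


def dist_le(a, b):
    """Branch distance for the predicate a <= b: 0 when it holds."""
    return 0 if a <= b else (a - b) + K


def dist_eq(a, b):
    """Branch distance for the predicate a == b: 0 when it holds."""
    return 0 if a == b else abs(a - b) + K


def withdrawal_fitness(amount, balance):
    """Sum of branch distances along a hypothetical success path:
    amount > 0, amount <= balance, and amount is a multiple of 100.
    Lower is better; 0 means the input takes the success path."""
    return (dist_gt(amount, 0)
            + dist_le(amount, balance)
            + dist_eq(amount % 100, 0))
```

A search algorithm such as the ones named above would try to minimize this value: an input like `withdrawal_fitness(1200, 1000)` is penalized in proportion to how far the amount exceeds the balance, which gives the search a gradient toward inputs that satisfy every condition.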

    Associations Among Obesity-Related Guilt, Shame, and Coping

    Psychological factors have proved to have a significant influence on the outcome and success of the treatment of obesity, and there might be a psychological mechanism explaining why only a subgroup of the obese population suffers from being overweight. The main hypothesis of this work is that weight-related shame and guilt feelings are psychological factors crucial for both emotional well-being and the success of weight loss attempts. Prior studies found suggestive evidence that this hypothesis might be valid: Obese individuals are likely to experience weight-related shame feelings through the contrast of an overtly visible stigma and the omnipresent thin ideal in society. Weight-related guilt feelings are likely experienced since weight control is still viewed as a matter of willpower by obese as well as nonobese individuals, but unfortunately most weight loss attempts do not remain successful. Consequently, the three manuscripts address the following research questions: Are weight- and body-related shame and guilt concerning weight control separate constructs? Are weight-related shame and guilt feelings associated with BMI? Are shame-based or guilt-based coping responses predictive of weight change? Is it possible to minimize guilt and shame feelings about eating through a counseling approach emphasizing genetic factors in the development of obesity? The first manuscript presents the evaluation of the psychometric properties of a new self-report measure of weight- and body-related shame and guilt (WEB-SG) in a sample of 331 obese individuals. The factorial structure of the WEB-SG supported a two-factor conceptualization. The WEB-SG subscales proved to be internally consistent and temporally stable. The construct validity of the subscales was evidenced by a substantial overlap of common variance with other shame and guilt measures. Also, the subscales showed differential correlation patterns with other scales, but were not substantially associated with BMI. 
Thus, it appears that the frequency of weight-related shame and guilt feelings in obese individuals may be affected by factors other than weight. The second manuscript presents the longitudinal associations among weight-related coping, guilt, and shame in a sample of 98 obese individuals. The study explored the kind and frequency of typical coping situations in which obese individuals become aware of being obese. Individuals reported mostly negative evaluations through others/self, physical exercise situations, or environmental hazards. Again, the perceived distress about those situations did not differ significantly between levels of obesity, but was strongly correlated to weight-related shame and guilt. Excessive body weight itself does not appear to be the determinant of distress about weight-related situations, but cognitive appraisal of the situation. Furthermore, the study sought to determine the predictive utility of weight-related shame and guilt concerning coping responses. Contrary to the hypothesis, weight-related shame at baseline was a significant negative predictor for problem-focused engagement coping, whereas, as expected, weight-related guilt was a significant positive predictor for problem-focused engagement strategies and dietary restraint at follow-up. Finally, weight loss was accompanied by a substantial drop in problem-focused disengagement coping. The study outlined in the third manuscript tested the effects of a consultation using genetic information about obesity on attitudes about weight loss goals, self-blame about eating, and weight-related coping in obese individuals. For that purpose, we chose a longitudinal experimental design with two intervention groups (n1 = 126; n2 = 127) and a control group (n = 98). Independent variables were the experimental variation of the consultation (with and without genetic information), the familial predisposition (at least one parent/sibling obese vs. 
no parent/sibling obese), and two assessment points (after consultation and 6-month follow-up). Individuals with and without a familial predisposition profited in different ways from a consultation using genetic information about obesity: At follow-up, individuals with a familial predisposition reported mainly a relieving effect in the form of less self-blame about eating. Both experimental groups, independent of the factors Consultation and Familial Predisposition, reported an adjustment to more realistic weight loss goals and a greater satisfaction with a 5% weight loss. Regarding weight change, the less satisfied obese individuals felt about their current weight at baseline, the higher the risk that these individuals had gained weight at follow-up. In summary, a consultation with genetic information about obesity and feedback on familial susceptibility seems to be helpful, especially for obese individuals with a familial predisposition.


    This book demonstrates the great efforts aimed at further improving the care of hemophilia, which may further improve the quality of life of persons with hemophilia and their families.

    Network analysis of large scale object oriented software systems

    PhD Thesis. The evolution of software engineering knowledge, technology, tools, and practices has seen progressive adoption of new design paradigms. Currently, the predominant design paradigm is object oriented design. Despite the advocated and demonstrated benefits of object oriented design, there are known limitations of static software analysis techniques for object oriented systems, and there are many current and legacy object oriented software systems that are difficult to maintain using the existing reverse engineering techniques and tools. Consequently, there is renewed interest in dynamic analysis of object oriented systems, and the emergence of large and highly interconnected systems has fuelled research into the development of new scalable techniques and tools to aid program comprehension and software testing. In dynamic analysis, a key research problem is efficient interpretation and analysis of large volumes of precise program execution data to facilitate efficient handling of software engineering tasks. Some of the techniques employed to improve the efficiency of analysis are inspired by empirical approaches developed in other fields of science and engineering that face comparable data analysis challenges. This research is focused on application of empirical network analysis measures to dynamic analysis data of object oriented software. The premise of this research is that the methods that contribute significantly to the object collaboration network's structural integrity are also important for delivery of the software system’s function. This thesis makes two key contributions. First, a definition is proposed for the concept of the functional importance of methods of object oriented software. Second, the thesis proposes and validates a conceptual link between object collaboration networks and the properties of a network model with power law connectivity distribution. 
Results from empirical software engineering experiments on JHotDraw and Google Chrome are presented. The results indicate that the five standard centrality-based network measures considered can be used to predict functionally important methods with a significant level of accuracy. The search for functional importance of software elements is an essential starting point for program comprehension and software testing activities. The proposed definition and application of network analysis has the potential to improve the efficiency of post-release software engineering activities by facilitating rapid identification of potentially functionally important methods in object oriented software. These results, with some refinement, could be used to perform change impact prediction and could support a host of other potentially beneficial applications to improve software engineering techniques.
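
As a rough picture of how a centrality measure can rank methods in a collaboration network, the following Python sketch computes normalized degree centrality over an undirected view of a small call graph. The abstract does not name its five measures, so degree centrality is used here only as one standard example; the method names and call pairs are invented.

```python
def degree_centrality(calls):
    """Normalized degree centrality over an undirected view of a call graph.

    `calls` is an iterable of (caller, callee) pairs.
    """
    neighbours = {}
    for caller, callee in calls:
        if caller == callee:
            continue  # ignore self-calls
        neighbours.setdefault(caller, set()).add(callee)
        neighbours.setdefault(callee, set()).add(caller)
    n = len(neighbours)
    if n < 2:
        return {node: 0.0 for node in neighbours}
    return {node: len(adj) / (n - 1) for node, adj in neighbours.items()}


def rank_methods(calls):
    """Methods ordered from most to least central (ties broken by name)."""
    scores = degree_centrality(calls)
    return sorted(scores, key=lambda m: (-scores[m], m))


# Illustrative execution trace reduced to caller/callee pairs.
calls = [("A.run", "B.draw"), ("A.run", "C.save"),
         ("A.run", "D.load"), ("B.draw", "C.save")]
scores = degree_centrality(calls)
ranking = rank_methods(calls)
```

Under the thesis's premise, methods near the top of such a ranking would be candidates for functionally important methods, to be inspected or tested first.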

    The Software Vulnerability Ecosystem: Software Development In The Context Of Adversarial Behavior

    Software vulnerabilities are the root cause of many computer system security failures. This dissertation addresses software vulnerabilities in the context of a software lifecycle, with a particular focus on three stages: (1) improving software quality during development; (2) pre-release bug discovery and repair; and (3) revising software as vulnerabilities are found. The question I pose regarding software quality during development is whether long-standing software engineering principles and practices such as code reuse help or hurt with respect to vulnerabilities. Using a novel data-driven analysis of large databases of vulnerabilities, I show the surprising result that software quality and software security are distinct. Most notably, the analysis uncovered a counterintuitive phenomenon, namely that newly introduced software enjoys a period with no vulnerability discoveries, and further that this “Honeymoon Effect” (a term I coined) is well-explained by the unfamiliarity of the code to malicious actors. An important consequence for code reuse, intended to raise software quality, is that protections inherent in delays in vulnerability discovery from new code are reduced. The second question I pose is the predictive power of this effect. My experimental design exploited a large-scale open source software system, Mozilla Firefox, in which two development methodologies are pursued in parallel, making that the sole variable in outcomes. A comparison of the methodologies, using a novel synthesis of data from vulnerability databases, suggests that the rapid-release cycles used in agile software development (in which new software is introduced frequently) have a vulnerability discovery rate equivalent to conventional development. Finally, I pose the question of the relationship between the intrinsic security of software, stemming from design and development, and the ecosystem into which the software is embedded and in which it operates. 
I use the early development lifecycle to examine this question, and again use vulnerability data as the means of answering it. Defect discovery rates should decrease in a purely intrinsic model, with software maturity making vulnerabilities increasingly rare. The data, which show that vulnerability rates increase after a delay, contradict this. Software security therefore must be modeled including extrinsic factors, thus comprising an ecosystem.

    Termination, correctness and relative correctness

    Over the last decade, research in verification and formal methods has been the subject of increased interest, driven by the need for more secure and dependable software. At the heart of software dependability is the concept of software fault, defined in the literature as the adjudged or hypothesized cause of an error. This definition, which lacks precision, presents at least two challenges with regard to using formal methods: (1) Adjudging and hypothesizing are highly subjective human endeavors; (2) The concept of error is itself insufficiently defined, since it depends on a detailed characterization of correct system states at each stage of a computation (which is usually unavailable). In the process of defining what a software fault is, the concept of relative correctness, the property of a program to be more-correct than another with respect to a given specification, is discussed. Under this view, a feature of a program is a fault (for a given specification) only because there exists an alternative to it that would make the program more-correct with respect to the specification. Furthermore, the implications and applications of relative correctness in various software engineering activities are explored. It is then illustrated that in many situations of software testing, fault removal, and program repair, testing for relative correctness rather than absolute correctness leads to clearer conclusions and better outcomes. In particular, debugging without testing is introduced: a technique whereby a fault can be removed from a program and the new program proven to be more-correct than the original, all without any testing (and its associated uncertainties and imperfections). Given that there are orders of magnitude more incorrect programs than correct programs in use nowadays, this has the potential to expand the scope of proving methods significantly. Another technique, programming without refining, is also introduced. 
The most important advantage of program derivation by correctness enhancement is that it captures not only program construction from scratch, but also virtually all activities of software evolution. Given that nowadays most software is developed by evolving existing assets rather than producing new assets from scratch, the paradigm of software evolution by correctness enhancements stands to yield significant gains, if we can make it practical.
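
The notion of one program being more-correct than another can be sketched on a small finite state space. The Python example below assumes a simplified reading that the abstract itself does not spell out: programs are total deterministic functions, the specification is a set of (input, output) pairs, and a program's competence domain is the set of inputs on which its output satisfies the specification; a candidate is then taken to be more-correct than the original when its competence domain includes the original's. All names and the sample specification are illustrative.

```python
def competence_domain(spec, program, states):
    """Inputs on which the program's output satisfies the specification."""
    return {s for s in states if (s, program(s)) in spec}


def is_more_correct(spec, candidate, original, states):
    """True when the candidate behaves correctly wherever the original does."""
    return (competence_domain(spec, candidate, states)
            >= competence_domain(spec, original, states))


states = range(-3, 4)
spec = {(s, abs(s)) for s in states}  # specification: the output is |s|


def original(s):
    # Faulty: correct only for non-negative inputs.
    return s


def repaired(s):
    # One fault removed: now also correct for s < -1, still wrong at s == -1.
    return -s if s < -1 else s
```

Here `repaired` is more-correct than `original` with respect to `spec` without being absolutely correct, which is the situation the abstract describes: a fault removal can be justified by comparing competence domains rather than by demanding full correctness.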