    Mining Fix Patterns for FindBugs Violations

    In this paper, we first collect and track a large number of fixed and unfixed violations across revisions of software projects. The empirical analyses reveal discrepancies between the distributions of violations that are detected and those that are fixed, in terms of occurrences, spread, and categories, which can provide insights for prioritizing violations. To automatically identify patterns in violations and their fixes, we propose an approach that uses convolutional neural networks to learn features and clustering to group similar instances. We then evaluate the usefulness of the identified fix patterns by applying them to unfixed violations. The results show that developers accept and merge a majority (69/116) of the fixes generated from the inferred fix patterns. It is also noteworthy that the yielded patterns are applicable to four real bugs in the Defects4J benchmark, a major benchmark for software testing and automated repair.
    Comment: Accepted for IEEE Transactions on Software Engineering.
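
    To make the notion of a fix pattern concrete, the sketch below shows a hypothetical violation/fix pair of the kind such mining targets. The FindBugs pattern name (NP_NULL_ON_SOME_PATH) is a real detector for possible null dereferences, but the code and method names here are invented, not taken from the paper.

```java
// Illustrative sketch (not from the paper): a common FindBugs violation and
// the kind of recurring fix a mined pattern would apply.
import java.util.HashMap;
import java.util.Map;

public class FixPatternExample {

    // Violating version: Map.get() may return null, so length() can throw
    // a NullPointerException (FindBugs: NP_NULL_ON_SOME_PATH).
    static int labelLengthBefore(Map<String, String> labels, String key) {
        String label = labels.get(key);
        return label.length(); // possible null dereference
    }

    // Fixed version: the recurring fix pattern inserts a null guard.
    static int labelLengthAfter(Map<String, String> labels, String key) {
        String label = labels.get(key);
        return label == null ? 0 : label.length();
    }

    public static void main(String[] args) {
        Map<String, String> labels = new HashMap<>();
        labels.put("a", "alpha");
        System.out.println(labelLengthAfter(labels, "a")); // 5
        System.out.println(labelLengthAfter(labels, "b")); // 0
    }
}
```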

    Organizing the Technical Debt Landscape

    To date, several methods and tools for detecting source code and design anomalies have been developed. While each method focuses on identifying certain classes of source code anomalies that potentially relate to technical debt (TD), the overlaps and gaps among these classes and TD have not been rigorously demonstrated. We propose to construct a seminal technical debt landscape as a way to visualize and organize research on the subject.

    Using Automatic Static Analysis to Identify Technical Debt

    The technical debt (TD) metaphor describes a tradeoff between short-term and long-term goals in software development. Developers, in such situations, accept compromises in one dimension (e.g. maintainability) to meet an urgent demand in another dimension (e.g. delivering a release on time). Since TD incurs interest, in terms of time spent to correct the code and accomplish quality goals, the accumulation of TD in software systems is dangerous because it can lead to more difficult and expensive maintenance. The research presented in this paper focuses on the use of automatic static analysis to identify technical debt at the code level with respect to different quality dimensions. The methodological approach is that of empirical software engineering; both past and current results are presented, focusing on functionality, efficiency, and maintainability.
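
    As an illustration of what code-level TD flagged by static analysis can look like, consider the hypothetical sketch below: a maintainability issue (a magic number) and the refactoring that pays the debt down. The example is ours, not the paper's.

```java
// Illustrative sketch (not from the paper): a code-level TD item an ASA tool
// can flag, mapped to the maintainability quality dimension.
public class TechnicalDebtExample {

    // Maintainability debt: the magic number obscures intent and makes
    // future change costly; a typical ASA tool reports it.
    static double priceWithTaxBefore(double net) {
        return net * 1.19; // ASA warning: magic number
    }

    // Paying down the debt: a named constant documents the business rule.
    static final double VAT_RATE = 0.19;

    static double priceWithTaxAfter(double net) {
        return net * (1.0 + VAT_RATE);
    }

    public static void main(String[] args) {
        System.out.println(priceWithTaxBefore(100.0)); // 119.0
        System.out.println(priceWithTaxAfter(100.0));  // 119.0
    }
}
```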

    Investigating Automatic Static Analysis Results to Identify Quality Problems: an Inductive Study

    Background: Automatic static analysis (ASA) tools examine source code to discover "issues", i.e. code patterns that are symptoms of bad programming practices and that can lead to defective behavior. Studies in the literature have shown that these tools find defects earlier than other verification activities, but they produce a substantial number of false positive warnings. For this reason, an alternative approach is to use the set of ASA issues to identify defect-prone files and components rather than focusing on the individual issues. Aim: We conducted an exploratory study to investigate whether ASA issues can be used as early indicators of faulty files and components and, for the first time, whether they point to a decay of specific software quality attributes, such as maintainability or functionality. Our aim is to understand the critical parameters and feasibility of such an approach to feed into future research on more specific quality and defect prediction models. Method: We analyzed an industrial C# web application using the Resharper ASA tool and explored whether significant correlations exist in such a data set. Results: We found promising results when predicting defect-prone files. Specific Resharper categories are better indicators of faulty files than common software metrics or the collection of issues across all categories, and these categories correlate with different software quality attributes. Conclusions: Our advice for future research is to perform the analysis at the file rather than the component level and to evaluate the generalizability of the categories. We also recommend using larger datasets, as we learned that data sparseness can lead to challenges in the proposed analysis process.
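
    A minimal sketch of the kind of correlation analysis described, assuming per-file issue and defect counts have already been extracted; the numbers are invented and the authors' actual tooling is not reproduced. Spearman rank correlation (here via Apache Commons Math 3) suits the skewed count data such studies typically face.

```java
// Minimal sketch (not the authors' code): correlate per-file ASA issue
// counts with per-file defect counts.
// Requires Apache Commons Math 3 (org.apache.commons.math3).
import org.apache.commons.math3.stat.correlation.SpearmansCorrelation;

public class IssueDefectCorrelation {
    public static void main(String[] args) {
        // Hypothetical per-file data: issues of one category, defects found.
        double[] issuesPerFile  = {12, 3, 0, 7, 25, 1, 9};
        double[] defectsPerFile = { 4, 1, 0, 2,  9, 0, 3};

        // Rank correlation is robust to the skewed, sparse distributions
        // that issue counts typically follow.
        double rho = new SpearmansCorrelation()
                .correlation(issuesPerFile, defectsPerFile);
        System.out.printf("Spearman rho = %.3f%n", rho);
    }
}
```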

    Comparing Four Approaches for Technical Debt Identification

    Background: Software systems accumulate technical debt (TD) when short-term goals in software development are traded for long-term goals (e.g., a quick-and-dirty implementation to reach a release date vs. a well-refactored implementation that supports the long-term health of the project). Some forms of TD accumulate over time in the form of source code that is difficult to work with and exhibits a variety of anomalies. A number of source code analysis techniques and tools have been proposed to identify the code-level debt accumulated in a system. What has not yet been studied is whether using multiple tools to detect TD leads to benefits, i.e. whether different tools flag the same or different source code components. Furthermore, the symptoms of TD "interest" that these techniques lead to have not been investigated. To address this latter question, we also investigated whether TD, as identified by the source code analysis techniques, correlates with interest payments in the form of increased defect- and change-proneness. Aims: To compare the results of different TD identification approaches, to understand their commonalities and differences, and to evaluate their relationship to indicators of future TD "interest". Method: We selected four different TD identification techniques (code smells, automatic static analysis (ASA) issues, grime buildup, and modularity violations) and applied them to 13 versions of the Apache Hadoop open source software project. We collected and aggregated statistical measures to investigate whether the different techniques identified TD indicators in the same or different classes and whether those classes in turn exhibited high interest (in the form of a large number of defects and higher change-proneness). Results: The outputs of the four approaches have very little overlap and therefore point to different problems in the source code. Dispersed coupling and modularity violations were co-located in classes with higher defect-proneness. We also observed a strong relationship between modularity violations and change-proneness. Conclusions: Our main contribution is an initial overview of the TD landscape, showing that different TD detection techniques are loosely coupled and therefore indicate problems in different locations of the source code. Moreover, our proxy interest indicators (change- and defect-proneness) correlate with only a small subset of TD indicators.
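
    The overlap question at the heart of the study can be illustrated with a short sketch (not the authors' code): given the sets of classes flagged by two techniques, compute their Jaccard overlap. The class names below are hypothetical Hadoop-style examples.

```java
// Sketch (not the authors' code) of the overlap analysis: how much do the
// sets of classes flagged by two TD detection techniques intersect?
import java.util.HashSet;
import java.util.Set;

public class TdOverlap {

    // Jaccard similarity: |A n B| / |A u B|; 0 = disjoint, 1 = identical.
    static double jaccard(Set<String> a, Set<String> b) {
        Set<String> inter = new HashSet<>(a);
        inter.retainAll(b);
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        return union.isEmpty() ? 0.0 : (double) inter.size() / union.size();
    }

    public static void main(String[] args) {
        // Hypothetical classes flagged by two different techniques.
        Set<String> codeSmells = Set.of("JobTracker", "TaskRunner", "DFSClient");
        Set<String> asaIssues  = Set.of("DFSClient", "NameNode");
        System.out.printf("Jaccard overlap = %.2f%n",
                jaccard(codeSmells, asaIssues)); // 0.25
    }
}
```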

    Quantitative Assessment of the Impact of Automatic Static Analysis Issues on Time Efficiency

    Background: Automatic static analysis (ASA) tools analyze source code and look for code patterns (a.k.a. smells) that might cause defective behavior or degrade other dimensions of software quality, e.g. efficiency. There are many potentially negative code patterns, and ASA tools typically report a huge list of them, even in small programs. Moreover, little evidence is available so far about the negative performance impact of the code patterns identified by such tools. As a consequence, programmers cannot appreciate the benefits of ASA tools and tend not to include them in their workflow. Aims: To quantitatively assess the impact of issues signaled by ASA tools on time efficiency. Method: We selected 20 issues and, for each of them, set up two source code fragments: one containing the issue and the corresponding refactored version, functionally identical but without the issue. We set up three different platforms, isolated from the network and other user programs, executed the code fragments, and measured the execution time of both code versions. Results: We found that eleven issues have an actual negative impact on performance. For each issue, we also computed an estimate of the delay introduced by a single execution. Conclusions: We produced a set of issues with a verified negative impact on performance. They can be checked easily with an analysis tool, and the code can be refactored to obtain demonstrably more efficient code. We also provide the estimated delay cost of each issue in the environments where we conducted the tests. These results can be improved with the help of other researchers: repeating the tests on several platforms would make it possible to build up a wider benchmark.
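
    A simplified sketch of the measurement idea follows; the paper's isolated platforms and full harness are not reproduced. It times a fragment containing a classic ASA performance issue, string concatenation in a loop, against its functionally identical refactored twin.

```java
// Simplified sketch (not the authors' harness): measure the execution time
// of an issue-bearing fragment vs. its refactored version.
public class IssueBenchmark {

    static String concatInLoop(int n) {
        String s = "";
        for (int i = 0; i < n; i++) s += i; // issue: quadratic copying
        return s;
    }

    static String withBuilder(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) sb.append(i); // refactored version
        return sb.toString();
    }

    static long timeNanos(Runnable r) {
        long start = System.nanoTime();
        r.run();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int n = 50_000;
        concatInLoop(n); withBuilder(n); // warm-up to trigger JIT compilation
        System.out.println("concat  ns: " + timeNanos(() -> concatInLoop(n)));
        System.out.println("builder ns: " + timeNanos(() -> withBuilder(n)));
    }
}
```

    A single warm-up run and System.nanoTime only convey the idea; trustworthy measurements would need a dedicated benchmark harness such as JMH, which is presumably closer to what the paper's controlled setup provides.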

    Automatically fixing static analysis tools violations

    Master's dissertation, Universidade de Brasília, Instituto de Ciências Exatas, Departamento de Ciência da Computação, 2019. Software quality is becoming more important as the reliance on software systems increases. Software defects may have a high cost to organizations, as some can lead to software failure. Static analysis tools analyze code to find deviations, or violations, from recommended programming practices defined as rules. This analysis can find software defects earlier, faster, and cheaper than manual inspections. When fixing a violation, a programmer is required to modify the violating code. Such modifications can be tedious, error-prone, and repetitive. Unsurprisingly, automated transformations are frequently requested by developers. This work implements automatic transformations tailored to solve violations identified by static analysis tools. First, we investigate the use of SonarQube, a widely used static analysis tool, in two large open source organizations and two Brazilian Federal Government institutions. Our results show that a small subset of the rules is responsible for a large portion of the fixes. We implement automatic fixes for 11 rules from the set of frequently fixed rules found in the previous study. We submitted 38 pull requests, including 920 fixes generated automatically by our technique, to various open-source Java projects. Project maintainers accepted 84% of our fixes (95% of them without any modifications). These results indicate that our approach is feasible and can aid developers with automatic fixes, a long-requested feature.
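
    A deliberately simplified sketch of what one such automatic fix could look like; the dissertation's transformations presumably operate on syntax trees, while this toy version uses a plain regex for one commonly fixed SonarQube suggestion (replace x.size() == 0 with x.isEmpty()).

```java
// Toy sketch (not the dissertation's tool): apply a textual rewrite for the
// SonarQube suggestion to test emptiness with isEmpty() instead of size().
// A production fixer would rewrite the AST, not raw text.
import java.util.regex.Pattern;

public class SizeToIsEmptyFixer {

    // Matches expressions of the form `identifier.size() == 0`.
    private static final Pattern SIZE_EQ_ZERO =
            Pattern.compile("(\\w+)\\.size\\(\\)\\s*==\\s*0");

    static String fix(String source) {
        return SIZE_EQ_ZERO.matcher(source).replaceAll("$1.isEmpty()");
    }

    public static void main(String[] args) {
        String before = "if (users.size() == 0) { return; }";
        System.out.println(fix(before)); // if (users.isEmpty()) { return; }
    }
}
```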