1,511 research outputs found

Are Multi-language Design Smells Fault-prone? An Empirical Study

Authors: Abidi Mouna, Khomh Foutse, Openja Moses, Rahman Md Saidur
Publication date: 02/11/2020

Nowadays, modern applications are developed using components written in different programming languages. These systems introduce several advantages. However, as the number of languages increases, so do the challenges related to the development and maintenance of these systems. In such situations, developers may introduce design smells (i.e., anti-patterns and code smells), which are symptoms of poor design and implementation choices. Design smells are defined as poor design and coding choices that can negatively impact the quality of a software program despite satisfying functional requirements. Studies on mono-language systems suggest that the presence of design smells affects code comprehension, thus making systems harder to maintain. However, these studies target only mono-language systems and do not consider the interaction between different programming languages. In this paper, we present an approach to detect multi-language design smells in the context of JNI systems. We then investigate the prevalence of those design smells. Specifically, we detect 15 design smells in 98 releases of nine open-source JNI projects. Our results show that the design smells are prevalent in the selected projects and persist throughout the releases of the systems. We observe that in the analyzed systems, 33.95% of the files involving communications between Java and C/C++ contain occurrences of multi-language design smells. Some kinds of smells are more prevalent than others, e.g., Unused Parameters, Too Much Scattering, and Unused Method Declaration. Our results suggest that files with multi-language design smells are often more associated with bugs than files without these smells, and that specific smells are more correlated with fault-proneness than others.

arXiv.org e-Print Archive

PolyPublie

Predicting Software Fault Proneness Using Machine Learning

Author: Ghanathey Sanjay
Publication venue: Scholarship@Western
Publication date: 19/12/2018

Context: Continuous Integration (CI) is a DevOps technique which is widely used in practice. Studies show that its adoption rates will increase even further. At the same time, it is argued that maintaining product quality requires extensive and time-consuming testing and code reviews. In this context, if not done properly, shorter sprint cycles and agile practices entail higher risk for the quality of the product. It has been reported in the literature [68] that lack of proper test strategies, poor test quality, and team dependencies are some of the major challenges encountered in continuous integration and deployment. Objective: The objective of this thesis is to bridge the process discontinuity that exists between development teams and testing teams, due to continuous deployments and shorter sprint cycles, by providing a list of potentially buggy or high-risk files, which can be used by testers to prioritize code inspection and testing, thus reducing the time between development and release. Approach: Our approach is based on a five-step process. The first step is to select a set of systems, a set of code metrics, a set of repository metrics, and a set of machine learning techniques to consider for training and evaluation purposes. The second step is to devise appropriate client programs to extract and denote information obtained from GitHub repositories and source code analyzers. The third step is to use this information to train the models using the selected machine learning techniques. This step allowed us to identify the best performing machine learning techniques among those initially selected in the first step. The fourth step is to apply the models with a voting classifier (with equal weights) and provide answers to five research questions pertaining to the prediction capability and generality of the obtained fault-proneness prediction framework. The fifth step is to select the best performing predictors and apply them to two systems written in a completely different language (C++) in order to evaluate the performance of the predictors in a new environment. Obtained Results: The obtained results indicate that a) the best models were the ones applied on the same system as the one trained on; b) the models trained using repository metrics outperformed the ones trained using code metrics; c) the models trained using code metrics proved inadequate for predicting fault-prone modules; d) the use of machine learning as a tool for building fault-proneness prediction models is promising, but there is still work to be done, as the models show weak to moderate prediction capability. Conclusion: This thesis provides insights into how machine learning can be used to predict whether a source code file contains one or more faults that may contribute to a major system failure. The proposed approach uses information extracted both from the system’s source code, such as code metrics, and from a series of DevOps tools, such as bug repositories, version control systems, and testing automation frameworks. The study involved five Java and five Python systems and indicated that machine learning techniques have potential for building models that alert developers about failure-prone code.
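
The equal-weight voting mentioned in the fourth step of the abstract above can be illustrated with a minimal sketch. This is not the thesis's implementation: the function name `equal_weight_vote` and the sample predictions are hypothetical, and plain majority (hard) voting is assumed, since the abstract does not say whether the voting is hard or soft.

```python
from collections import Counter


def equal_weight_vote(predictions):
    """Combine per-model label lists by majority vote, one vote per model.

    `predictions` is a list of lists: one inner list per model, each holding
    a predicted label (1 = fault-prone, 0 = clean) for every file.
    """
    voted = []
    # zip(*predictions) groups the labels that all models gave to one file.
    for labels_for_file in zip(*predictions):
        counts = Counter(labels_for_file)
        # Every model counts equally; the most frequent label wins.
        voted.append(counts.most_common(1)[0][0])
    return voted


# Hypothetical predictions from three models over four files.
model_a = [1, 0, 1, 0]
model_b = [1, 1, 0, 0]
model_c = [0, 1, 1, 0]

print(equal_weight_vote([model_a, model_b, model_c]))  # [1, 1, 1, 0]
```

An odd number of models is used here so that a binary vote cannot tie.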

Scholarship@Western

Technical Debt Decision-Making Framework

Author: Codabux Zadia
Publication venue: Scholars Junction
Publication date: 09/12/2016

Software development companies strive to produce high-quality software. In commercial software development environments, due to resource and time constraints, software is often developed hastily, which gives rise to technical debt. Technical debt refers to the consequences of taking shortcuts when developing software. These consequences include making the system difficult to maintain and defect-prone. Technical debt can have financial consequences and impede feature enhancements. Identifying technical debt and deciding which debt to address is challenging given resource constraints. Project managers must decide which debt has the highest priority and is most critical to the project. This decision-making process is not standardized and sometimes differs from project to project. My research goal is to develop a framework that project managers can use in their decision-making process to prioritize technical debt based on its potential impact. To achieve this goal, we survey software practitioners, conduct literature reviews, and mine software repositories for historical data to build a framework to model the technical debt decision-making process and inform practitioners of the most critical debt items.

Scholars Junction - Mississippi State University Institutional Repository

Technical Debt Decision-Making Framework

Author: Codabux Zadia
Publication venue: Scholars Junction
Publication date: 21/11/2016

Mississippi State University Libraries ETD database

Scholars Junction - Mississippi State University Institutional Repository

A machine and deep learning analysis among SonarQube rules, product, and process metrics for fault prediction

Authors: Lenarduzzi Valentina, Lomio Francesco, Moreschini Sergio
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2022

Background: Developers spend more time fixing bugs and refactoring the code to increase maintainability than developing new features. Researchers have investigated the impact of code quality on fault-proneness, focusing on code smells and code metrics. Objective: We aim at advancing fault-inducing commit prediction using different variables, such as SonarQube rules, product metrics, and process metrics, and adopting different techniques. Method: We designed and conducted an empirical study among 29 Java projects analyzed with SonarQube and the SZZ algorithm to identify fault-inducing and fault-fixing commits, computing different product and process metrics. Moreover, we investigated fault-proneness using different Machine and Deep Learning models. Results: We analyzed 58,125 commits containing 33,865 faults and infected by more than 174 SonarQube rules violated 1.8M times, on which 48 software product and process metrics were calculated. Results clearly identified a set of features that provided a highly accurate fault prediction (more than 95% AUC). Regarding the performance of the classifiers, Deep Learning provided a higher accuracy compared with Machine Learning models. Conclusion: Future works might investigate whether other static analysis tools, such as FindBugs or Checkstyle, can provide similar or different results. Moreover, researchers might consider the adoption of time series analysis and anomaly detection techniques.
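
For readers unfamiliar with the AUC figure quoted in the abstract above, the following minimal sketch shows one standard way to compute the area under the ROC curve: the fraction of (faulty, clean) pairs in which the faulty item receives the higher score, with ties counted as half. The function name `roc_auc` and the sample labels and scores are hypothetical and are not taken from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    `labels` holds 1 for fault-inducing and 0 for clean commits;
    `scores` holds the classifier's predicted score for each commit.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # faulty commit ranked above clean commit
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical data: two fault-inducing and two clean commits.
print(roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.1]))  # 0.75
```

The pairwise loop is quadratic in the number of commits, which is fine for illustration; a rank-based formulation is the usual choice for large datasets.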

University of Oulu Repository - Jultika

Trepo - Institutional Repository of Tampere University

Improving software engineering processes using machine learning and data mining techniques

Author: Castelluccio Marco
Publication date: 11/12/2018

The availability of large amounts of data from software development has created an area of research called mining software repositories. Researchers mine data from software repositories both to improve understanding of software development and evolution, and to empirically validate novel ideas and techniques. The large amount of data collected from software processes can then be leveraged for machine learning applications. Indeed, machine learning can have a large impact in software engineering, just as it has had in other fields, supporting developers and other actors involved in the software development process in automating or improving parts of their work. Automation can make some phases of the development process not only less tedious or cheaper, but also more efficient and less prone to errors. Moreover, employing machine learning can reduce the complexity of difficult problems, enabling engineers to focus on more interesting problems rather than the basics of development. The aim of this dissertation is to show how the development and use of machine learning and data mining techniques can support several software engineering phases, ranging from crash handling, to code review, to patch uplifting, to software ecosystem management. To validate our thesis, we conducted several studies tackling different problems in an industrial open-source context, focusing on the case of Mozilla.

Università degli Studi di Napoli Federico II Open Archive