Search CORE

1,296 research outputs found

A Novel Approach to Determine Software Security Level using Bayes Classifier via Static Code Metrics

Authors: Sarıman Guncel; Ugur Kucuksille Ecir
Publication date: 21/12/2017

Technology is advancing day by day and software products are growing in an uncontrolled way. This leads to applications that do not comply with design principles. Software that has not passed security testing may put the end user in danger. Static and dynamic analysis may be used to detect errors in, and to verify, developed software. Static code analysis examines code in several categories while it is being written, without compiling it. Source code metrics are one of these categories. Code metrics are used to evaluate software quality, level of risk, and interchangeability. In this study, we describe a web-based application that we developed to determine the security level of software. We explain how the software's metrics are calculated and describe the scoring system used to determine the security level, which takes into account metric thresholds accepted in the literature. We also describe the Bayes classifier method, which distinguishes risks in project files by analysing uploaded sample software files. Finally, we explain the objectives of this analysis method and the planned activities.
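As a rough illustration of how a Bayes classifier could separate risky from non-risky files using static code metrics, the sketch below implements a small Gaussian naive Bayes in plain Python. The abstract does not state which metrics, thresholds, or Bayes variant the authors use, so the two features (cyclomatic complexity and lines of code), the "low"/"high" labels, and the training data are all hypothetical.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(samples, labels):
    """Estimate a prior and per-feature mean/variance for each class."""
    grouped = defaultdict(list)
    for features, label in zip(samples, labels):
        grouped[label].append(features)
    model = {}
    for label, rows in grouped.items():
        columns = list(zip(*rows))
        means = [sum(col) / len(col) for col in columns]
        # A small constant keeps every variance strictly positive.
        variances = [
            sum((x - m) ** 2 for x in col) / len(col) + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (len(rows) / len(samples), means, variances)
    return model


def predict(model, features):
    """Return the class label with the highest log-posterior."""
    def log_posterior(prior, means, variances):
        score = math.log(prior)
        for x, m, v in zip(features, means, variances):
            score += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        return score

    return max(model, key=lambda label: log_posterior(*model[label]))


# Hypothetical training data: (cyclomatic complexity, lines of code) per file.
metrics = [(3, 40), (5, 60), (4, 55), (25, 400), (30, 520), (22, 380)]
risk = ["low", "low", "low", "high", "high", "high"]
model = fit_gaussian_nb(metrics, risk)
```

Calling `predict(model, (4, 50))` classifies a new file from its two metric values; a real tool would use many more metrics and far more training data.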

ZENODO

Defect prediction with bad smells in code

Authors: Dąbrowska Marta; Hryszko Jarosław; Konopka Piotr; Madeyski Lech
Publication date: 18/03/2017

Background: Defect prediction in software can be highly beneficial for development projects when the prediction is highly effective and defect-prone areas are predicted correctly. One of the key elements of effective software defect prediction is the proper selection of the metrics used for dataset preparation. Objective: The purpose of this research is to verify whether code smell metrics, collected using the Microsoft CodeAnalysis tool and added to a basic metric set, can improve defect prediction in an industrial software development project. Results: We verified whether extending the dataset with code-smell-sourced metrics changes the effectiveness of defect prediction by comparing prediction results for datasets with and without code-smell-oriented metrics. As a result, we observed only a small improvement in the effectiveness of defect prediction when the dataset extended with bad smell metrics was used: average accuracy increased by 0.0091 and stayed within the margin of error. However, when only code-smell-based metrics were used for prediction (without the basic set of metrics), the process yielded surprisingly high accuracy (0.8249) and F-measure (0.8286). We also describe the data anomalies and problems we observed when two different metric sources were used to prepare one consistent set of data. Conclusion: Extending the dataset with code-smell-sourced metrics does not significantly improve prediction effectiveness. The result achieved did not compensate for the effort needed to collect the additional metrics. However, we observed that defect prediction based on code smells only is still highly effective and can be used especially where other metrics can hardly be used.
Comment: Chapter 10 in Software Engineering: Improving Practice through Research (B. Hnatkowska and M. Śmiałek, eds.), pp. 163-176, 201
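The accuracy and F-measure figures quoted in this abstract follow the standard definitions for binary classification. The minimal sketch below computes both from actual and predicted defect labels; the function names and the eight sample labels are hypothetical and are not taken from the study.

```python
def confusion_counts(actual, predicted):
    """Count true/false positives and negatives for binary labels (1 = defective)."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(actual, predicted):
    """Share of all modules whose label was predicted correctly."""
    tp, tn, fp, fn = confusion_counts(actual, predicted)
    return (tp + tn) / (tp + tn + fp + fn)


def f_measure(actual, predicted):
    """Harmonic mean of precision and recall for the defective class."""
    tp, _, fp, fn = confusion_counts(actual, predicted)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for eight modules (1 = defective, 0 = clean).
actual = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 1, 0, 0, 0, 1, 1, 0]
```

This sketch assumes at least one predicted and one actual positive; production code would guard the divisions against zero denominators.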

arXiv.org e-Print Archive

Towards Surgically-Precise Technical Debt Estimation: Early Results and Research Roadmap

Authors: Lenarduzzi Valentina; Martini Antonio; Taibi Davide; Tamburri Damian Andrew
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2019

The concept of technical debt has been explored from many perspectives, but its precise estimation is still under heavy empirical and experimental inquiry. We aim to understand whether, by harnessing approximate, data-driven, machine-learning approaches, it is possible to improve the current techniques for technical debt estimation, as represented by a top industry quality analysis tool such as SonarQube. For the sake of simplicity, we focus on relatively simple regression modelling techniques and apply them to modelling the additional project cost connected to the sub-optimal conditions existing in the projects under study. Our results show that current techniques can be improved towards a more precise estimation of technical debt, and the case study shows promising results towards more accurate estimation of technical debt.
Comment: 6 pages
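The abstract does not say which regression models were applied, so the following is only a sketch of the simplest case: ordinary least squares with a single predictor. The predictor (number of reported issues) and the target (extra effort in hours) are hypothetical stand-ins for the project cost data the authors mention.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: issues reported per project vs. extra effort in hours.
issues = [1, 2, 3, 4]
extra_effort = [3, 5, 7, 9]
slope, intercept = fit_line(issues, extra_effort)


def estimate(x):
    """Predict the extra effort for a project with x reported issues."""
    return slope * x + intercept
```

With this toy data the fitted line passes through every point, which real project data would not do; the residuals are what a comparison against a tool's own estimate would examine.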

arXiv.org e-Print Archive

Crossref

Trepo - Institutional Repository of Tampere University

Predicting code refactoring via analyzing the history of quality metrics and code anti-patterns

Author: Alanqari Sarah
Publication venue: DePaul University
Publication date: 17/03/2023

Code refactoring is the process of improving the internal structure of existing code without altering its functionality. Refactoring can help to reduce technical debt, enhance the quality of the code, and make the code easier to evolve. However, manually identifying the proper refactoring operations to apply can be time-consuming and does not scale. In this thesis, we propose an approach based on data mining and machine learning techniques to analyze historical data and predict refactoring operations that may occur in a future release of a project. The approach uses a combination of techniques to identify patterns in the data and make predictions about which refactoring operations should be applied. In this study, we validated the proposed machine-learning-based approaches on 13 open-source projects with different releases. We identified the refactoring operations and code smells and extracted the quality metrics for each project release. We used the collected data (e.g. quality metrics and code smells) to predict refactoring operations, and we reported the prediction results based on cross-validation procedures. The proposed research contributes to the field of software quality by providing an efficient and effective approach to refactoring code. The findings of this research will also help developers by suggesting appropriate refactoring operations based on the history of the evolution of software projects. This will ultimately result in improved software quality, reduced technical debt, and enhanced software performance.
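The thesis abstract reports results "based on cross-validation procedures" without giving details. As one possible reading, the sketch below shows plain k-fold splitting: the samples are divided into k contiguous folds, and each fold serves once as the test set. The function name and the choice of 10 samples and 3 folds are hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) pairs for k contiguous folds."""
    # The first n_samples % k folds receive one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size


# Hypothetical setup: 10 samples (e.g. classes of one release) and 3 folds.
folds = list(k_fold_indices(10, 3))
```

A model would be trained on each `train` list and evaluated on the matching `test` list, with the per-fold scores averaged. Shuffling or stratifying the samples before splitting is common but omitted here.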

Via Sapientiae: The Institutional Repository at DePaul University

Class-Level Refactoring Prediction by Ensemble Learning with Various Feature Selection Techniques

Authors: Kuanar Sanjay Kumar; Kumar Lov; Misra Sanjay; Panigrahi Rasmita
Publication venue: MDPI AG
Publication date: 01/01/2022

Background: Refactoring means changing a software system without affecting its functionality. Current research aims to identify the appropriate method(s) or class(es) that need to be refactored in object-oriented software. Ensemble learning helps to reduce prediction errors by amalgamating different classifiers and their respective performances over the original feature data. This paper additionally considers several ensemble learners, error measures, sampling techniques, and feature selection techniques for refactoring prediction at the class level. Objective: This work aims to develop an ensemble-based refactoring prediction model with structural identification of source code metrics, using different feature selection techniques and data sampling techniques to distribute the data uniformly. Our model finds the best classifier, the one achieving fewer errors during refactoring prediction at the class level. Methodology: First, our proposed model extracts a total of 125 software metrics computed from object-oriented software systems, which are then processed by a robust multi-phased feature selection method encompassing the Wilcoxon significance test, the Pearson correlation test, and principal component analysis (PCA). The proposed multi-phased feature selection method retains the optimal features characterizing inheritance, size, coupling, cohesion, and complexity. After obtaining the optimal set of software metrics, a novel heterogeneous ensemble classifier is developed, with the following techniques used as base classifiers: ANN-Gradient Descent, ANN-Levenberg Marquardt, ANN-GDX, and ANN-Radial Basis Function; support vector machines with different kernel functions (LSSVM-Linear, LSSVM-Polynomial, LSSVM-RBF); a Decision Tree algorithm; a Logistic Regression algorithm; and an extreme learning machine (ELM) model. In our paper, we have calculated four different errors, i.e., Mean Absolute Error (MAE), Mean magnitude of Relative Error (MORE), Root Mean Square Error (RMSE), and Standard Error of Mean (SEM). Result: In our proposed model, the maximum voting ensemble (MVE) achieves better accuracy, recall, precision, and F-measure values (99.76, 99.93, 98.96, 98.44) than the base trained ensemble (BTE), and it produces fewer errors (MAE = 0.0057, MORE = 0.0701, RMSE = 0.0068, and SEM = 0.0107) when used to develop the refactoring model. Conclusions: Our experimental results suggest that MVE with upsampling can be used to improve the performance of the refactoring prediction model at the class level. Furthermore, the performance of our model with different data sampling techniques and feature selection techniques is shown in the form of boxplot diagrams of the accuracy, F-measure, precision, recall, and area under the curve (AUC) parameters.
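To make the voting idea and three of the error measures concrete, the sketch below combines the outputs of three base classifiers by majority vote and then computes MAE, RMSE, and SEM on the ensemble's errors. It assumes the textbook definitions of these measures; the paper's exact formulas (including the one for MORE, which is left out here) are not given in the abstract, and the predictions and labels are hypothetical.

```python
import math
import statistics
from collections import Counter


def majority_vote(votes):
    """Return the label chosen by most base classifiers."""
    return Counter(votes).most_common(1)[0][0]


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted labels."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_square_error(actual, predicted):
    """Square root of the average squared difference."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def standard_error_of_mean(values):
    """Sample standard deviation divided by the square root of the sample size."""
    return statistics.stdev(values) / math.sqrt(len(values))


# Hypothetical outputs of three base classifiers for four classes
# (1 = needs refactoring, 0 = does not).
base_predictions = [
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
]
# Vote column by column, i.e. once per class under prediction.
ensemble = [majority_vote(column) for column in zip(*base_predictions)]
actual = [1, 0, 1, 0]
abs_errors = [abs(a - p) for a, p in zip(actual, ensemble)]
```

An odd number of base classifiers is used so that binary votes cannot tie; with an even number, a tie-breaking rule would have to be chosen explicitly.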

Directory of Open Access Journals

HIØ Brage