    Open Hybrid Model: A New Ensemble Model for Software Development Cost Estimation

    Given the various features of a software project, it may face different administrative challenges that require the right decisions from software project managers. A major challenge is estimating software development cost, for which many researchers have proposed different methods. According to the literature, the capability of a proposed model or method is demonstrated on a specific set of software projects. Hence, the aim of this study is to present a model that takes advantage of the capabilities of various software development cost estimation models and methods simultaneously. For this purpose, a new model called the "open hybrid model" was proposed based on the firefly algorithm. The proposed model includes an extensible bank of estimation methods. The model also includes an extensible bank of rules that describe the relations between existing methods. Considering project conditions, the proposed model tries to find the best rule for combining the estimation methods in the methods bank. Three datasets of real projects were used to evaluate the precision of the proposed model, and the results were compared with those of 11 other methods. The results were compared based on performance parameters widely used to show the accuracy and stability of estimation models. According to the results, the open hybrid model was able to select the most appropriate methods present in the methods bank.

    Software Effort Estimation Accuracy Prediction of Machine Learning Techniques: A Systematic Performance Evaluation

    Software effort estimation accuracy is a key factor in effectively planning, controlling, and delivering a successful software project within budget and schedule. Both overestimation and underestimation are key challenges for future software development; hence there is a continuous need for accuracy in software effort estimation (SEE). Researchers and practitioners are striving to identify which machine learning estimation technique gives more accurate results based on evaluation measures, datasets, and other relevant attributes. The authors of related research are generally not aware of previously published results of machine learning effort estimation techniques. The main aim of this study is to help researchers learn which machine learning technique yields promising effort estimation accuracy in software development. In this paper, the performance of the machine learning ensemble technique is compared with that of the solo technique based on the two most commonly used accuracy evaluation metrics. We used the systematic literature review methodology proposed by Kitchenham and Charters. This includes searching for the most relevant papers, applying quality assessment criteria, extracting data, and drawing results. We evaluated the state-of-the-art accuracy performance of 28 selected studies (14 ensemble, 14 solo), using Mean Magnitude of Relative Error (MMRE) and PRED(25) as a set of reliable accuracy metrics for comparing the two techniques and answering the research questions stated in this study. We found that machine learning techniques are the most frequently implemented in the construction of ensemble effort estimation (EEE) techniques. The results of this study revealed that EEE techniques usually yield more promising estimation accuracy than solo techniques. Comment: Pages: 27 Figures: 15 Tables:
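
As a rough illustration of the two metrics named in this abstract, the sketch below computes MMRE and PRED(25) under their usual definitions: MRE is |actual - predicted| / actual, MMRE is its mean, and PRED(25) is the share of projects with MRE at or below 0.25. The function names and the effort values are hypothetical and are not taken from the reviewed studies.

```python
def mre(actual, predicted):
    """Magnitude of relative error for one project."""
    return abs(actual - predicted) / actual


def mmre(actuals, predictions):
    """Mean of the per-project MRE values (lower is better)."""
    errors = [mre(a, p) for a, p in zip(actuals, predictions)]
    return sum(errors) / len(errors)


def pred(actuals, predictions, threshold=0.25):
    """Fraction of projects with MRE <= threshold; 0.25 gives PRED(25)."""
    errors = [mre(a, p) for a, p in zip(actuals, predictions)]
    return sum(1 for e in errors if e <= threshold) / len(errors)


# Hypothetical effort values (person-months), for illustration only.
actual_effort = [100.0, 200.0, 50.0, 80.0]
estimated_effort = [110.0, 160.0, 50.0, 120.0]

print("MMRE:", mmre(actual_effort, estimated_effort))
print("PRED(25):", pred(actual_effort, estimated_effort))
```

Under these definitions a better estimator has a lower MMRE and a higher PRED(25), so the two metrics are usually reported together.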

    Numerical Simulation and Design of Ensemble Learning Based Improved Software Development Effort Estimation System

    This research paper proposes a novel approach to improving software development effort estimation by integrating ensemble learning algorithms with numerical simulation techniques. The objective of this study is to design an ensemble learning-based software development effort estimation system that leverages the strengths of multiple algorithms to enhance accuracy and reliability. The proposed system combines the power of ensemble learning, which involves aggregating predictions from multiple models, with numerical simulation techniques that enable the modelling and analysis of complex software development processes. A diverse set of software development projects is collected, encompassing various domains, sizes, and complexities. Ensemble learning algorithms such as Random Forest, Gradient Boosting, Bagging, and AdaBoost are selected for their ability to capture different aspects of the data and produce robust predictions. The proposed system architecture is presented, illustrating the flow of data and components. A model training and evaluation pipeline is developed, enabling the integration of ensemble learning and numerical simulation modules. The system combines the predictions generated by the ensemble models with the simulation results to produce more accurate and reliable effort estimates. The experimental setup involves a comprehensive evaluation of the proposed system. A real-world dataset comprising historical project data is utilized, and various performance metrics, including Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE), are employed to assess the effectiveness of the system. The results and analysis demonstrate that the ensemble learning-based effort estimation system outperforms traditional techniques, showcasing its potential to enhance project planning and resource allocation
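
The idea of aggregating predictions from several models and scoring them with MAE and RMSE can be sketched as follows. This is a minimal illustration that assumes plain averaging as the aggregation rule; it is not the system described in the abstract, and the function names, model names, and numbers are hypothetical.

```python
import math


def average_ensemble(model_predictions):
    """Average the predictions of several models, project by project."""
    n_models = len(model_predictions)
    return [sum(values) / n_models for values in zip(*model_predictions)]


def mae(actuals, predictions):
    """Mean Absolute Error."""
    return sum(abs(a - p) for a, p in zip(actuals, predictions)) / len(actuals)


def rmse(actuals, predictions):
    """Root Mean Squared Error."""
    squared = [(a - p) ** 2 for a, p in zip(actuals, predictions)]
    return math.sqrt(sum(squared) / len(actuals))


# Hypothetical effort values and per-model estimates, for illustration only.
actual_effort = [10.0, 20.0, 30.0]
model_a = [12.0, 18.0, 33.0]
model_b = [8.0, 24.0, 31.0]

combined = average_ensemble([model_a, model_b])
print("Combined estimates:", combined)
print("MAE:", mae(actual_effort, combined))
print("RMSE:", rmse(actual_effort, combined))
```

RMSE penalizes large individual errors more heavily than MAE, which is one reason the two are often reported side by side.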

    Ensemble missing data techniques for software effort prediction

    Constructing an accurate effort prediction model is a challenge in software engineering. The development and validation of models that are used for prediction tasks require good quality data. Unfortunately, software engineering datasets tend to suffer from incompleteness, which could result in inaccurate decision making in project management and implementation. Recently, the use of machine learning algorithms has proven to be of great practical value in solving a variety of software engineering problems, including software prediction, and this includes the use of ensemble (combining) classifiers. Research indicates that ensembles of individual classifiers lead to a significant improvement in classification performance by having them vote for the most popular class. This paper proposes a method for improving the software effort prediction accuracy produced by a decision tree learning algorithm by generating the ensemble using two imputation methods as elements. Benchmarking results on ten industrial datasets show that the proposed ensemble strategy has the potential to improve prediction accuracy compared to an individual imputation method, especially if multiple imputation is a component of the ensemble.

    Class-Level Refactoring Prediction by Ensemble Learning with Various Feature Selection Techniques

    Background: Refactoring is changing a software system without affecting the software functionality. Current researchers aim to identify the appropriate method(s) or class(es) that need to be refactored in object-oriented software. Ensemble learning helps to reduce prediction errors by amalgamating different classifiers and their respective performances over the original feature data. Other motives are added in this paper regarding several ensemble learners, errors, sampling techniques, and feature selection techniques for refactoring prediction at the class level. Objective: This work aims to develop an ensemble-based refactoring prediction model with structural identification of source code metrics using different feature selection techniques and data sampling techniques to distribute the data uniformly. Our model finds the best classifier after achieving fewer errors during refactoring prediction at the class level. Methodology: At first, our proposed model extracts a total of 125 software metrics computed from object-oriented software systems, which are processed by a robust multi-phased feature selection method encompassing the Wilcoxon significance test, the Pearson correlation test, and principal component analysis (PCA). The proposed multi-phased feature selection method retains the optimal features characterizing inheritance, size, coupling, cohesion, and complexity. After obtaining the optimal set of software metrics, a novel heterogeneous ensemble classifier is developed using the following base classifiers: ANN-Gradient Descent, ANN-Levenberg Marquardt, ANN-GDX, and ANN-Radial Basis Function; support vector machines with different kernel functions (LSSVM-Linear, LSSVM-Polynomial, LSSVM-RBF); a Decision Tree algorithm; a Logistic Regression algorithm; and an extreme learning machine (ELM) model. In our paper, we have calculated four different errors, i.e., Mean Absolute Error (MAE), Mean Magnitude of Relative Error (MORE), Root Mean Square Error (RMSE), and Standard Error of Mean (SEM). Result: In our proposed model, the maximum voting ensemble (MVE) achieves better accuracy, recall, precision, and F-measure values (99.76, 99.93, 98.96, 98.44) than the base trained ensemble (BTE), and it experiences fewer errors (MAE = 0.0057, MORE = 0.0701, RMSE = 0.0068, and SEM = 0.0107) during its implementation to develop the refactoring model. Conclusions: Our experimental results suggest that MVE with upsampling can be implemented to improve the performance of the refactoring prediction model at the class level. Furthermore, the performance of our model with different data sampling techniques and feature selection techniques has been shown in the form of boxplot diagrams of the accuracy, F-measure, precision, recall, and area under the curve (AUC) parameters.

    Predicting software faults in large space systems using machine learning techniques

    Recently, the use of machine learning (ML) algorithms has proven to be of great practical value in solving a variety of engineering problems, including the prediction of failure, fault, and defect-proneness as space system software becomes more complex. One of the most active areas of recent research in ML has been the use of ensemble classifiers. This paper shows how ML techniques (or classifiers) could be used to predict software faults in space systems, including many aerospace systems, and further uses ensembles of individual classifiers, having them vote for the most popular class, to improve system software fault-proneness prediction. Benchmarking results on four NASA public datasets show the Naive Bayes classifier to be the more robust software fault predictor, while most ensembles with a decision tree classifier as one of their components achieve higher accuracy rates.
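
Having individual classifiers "vote for the most popular class" can be sketched as plain majority voting. The sketch below assumes each classifier has already produced one label per software module. The classifier names, labels, and tie-breaking rule are hypothetical choices for illustration and do not reproduce the paper's experimental setup.

```python
from collections import Counter


def majority_vote(votes):
    """Return the most common label; ties go to the label seen first."""
    return Counter(votes).most_common(1)[0][0]


def ensemble_predict(classifier_outputs):
    """Combine per-classifier label lists into one label per module."""
    return [majority_vote(votes) for votes in zip(*classifier_outputs)]


# Hypothetical predictions from three classifiers for four modules.
naive_bayes = ["faulty", "clean", "clean", "faulty"]
decision_tree = ["faulty", "faulty", "clean", "clean"]
third_classifier = ["clean", "faulty", "clean", "faulty"]

combined = ensemble_predict([naive_bayes, decision_tree, third_classifier])
print(combined)
```

With an odd number of classifiers and two classes, a tie cannot occur, so the tie-breaking rule only matters for even-sized ensembles or multi-class problems.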

    Requirements Prioritization Based on Benefit and Cost Prediction: A Method Classification Framework

    In early phases of the software development process, requirements prioritization necessarily relies on the specified requirements and on predictions of benefit and cost of individual requirements. This paper induces a conceptual model of requirements prioritization based on benefit and cost. For this purpose, it uses Grounded Theory. We provide a detailed account of the procedures and rationale of (i) how we obtained our results and (ii) how we used them to form the basis for a framework for classifying requirements prioritization methods