10 research outputs found

    Software Development Effort Estimation Using Regression Fuzzy Models

    Software effort estimation plays a critical role in project management. Erroneous results may lead to over- or underestimating effort, which can have catastrophic consequences for project resources. Machine-learning techniques are increasingly popular in the field. Fuzzy logic models, in particular, are widely used to deal with imprecise and inaccurate data. The main goal of this research was to design and compare three different fuzzy logic models for predicting software development effort: Mamdani, Sugeno with constant output, and Sugeno with linear output. To assist in the design of the fuzzy logic models, we conducted regression analysis, an approach we call regression fuzzy logic. State-of-the-art, unbiased performance evaluation criteria such as standardized accuracy, effect size, and mean balanced relative error were used to evaluate the models, along with statistical tests. Models were trained and tested using industrial projects from the International Software Benchmarking Standards Group (ISBSG) dataset. Results showed that data heteroscedasticity affected model performance, and fuzzy logic models were found to be very sensitive to outliers. We concluded that when regression analysis was used to design the model, the Sugeno fuzzy inference system with linear output outperformed the other models. Comment: this paper was accepted in January 2019 by the Computational Intelligence and Neuroscience journal (in press).
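The first-order (linear-output) Sugeno system the abstract favors combines rule consequents that are linear functions of the input, weighted by each rule's firing strength. A minimal sketch with two hypothetical rules over a single "project size" input; the membership parameters and rule coefficients are invented for illustration, not taken from the paper:

```python
import numpy as np

def gauss(x, mean, sigma):
    """Gaussian membership function."""
    return np.exp(-0.5 * ((x - mean) / sigma) ** 2)

def sugeno_linear(size):
    """First-order Sugeno inference with two hypothetical rules.

    Rule 1: IF size is SMALL THEN effort = 1.2*size + 10
    Rule 2: IF size is LARGE THEN effort = 2.5*size + 40
    The crisp output is the firing-strength-weighted average of the
    rule outputs (no defuzzification step is needed).
    """
    w1 = gauss(size, mean=100.0, sigma=60.0)   # "small" membership
    w2 = gauss(size, mean=500.0, sigma=150.0)  # "large" membership
    y1 = 1.2 * size + 10.0
    y2 = 2.5 * size + 40.0
    return (w1 * y1 + w2 * y2) / (w1 + w2)

print(sugeno_linear(100.0))  # dominated by rule 1
print(sugeno_linear(500.0))  # dominated by rule 2
```

A Mamdani system would instead aggregate fuzzy output sets and defuzzify (e.g. by centroid), and the zero-order Sugeno variant replaces the linear consequents with constants.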

    COMPARATIVE ANALYSIS OF SOFTWARE EFFORT ESTIMATION USING DATA MINING TECHNIQUE AND FEATURE SELECTION

    Software development involves several interrelated factors that influence development effort and productivity. Improving the estimation techniques available to project managers will facilitate more effective time and budget control in software development. Software effort (or cost) estimation can help a software development company overcome difficulties experienced in estimating development effort. This study compares four machine-learning methods, Linear Regression (LR), Multilayer Perceptron (MLP), Radial Basis Function (RBF), and Decision Tree Random Forest (DTRF), for calculating estimated software cost/effort. These approaches were tested on 10 software development project datasets, yielding new knowledge about which methods are most accurate for software effort estimation, and about whether using Particle Swarm Optimization (PSO) for attribute selection increases accuracy compared with no feature selection. The algorithm that produced the most optimal software effort estimate was Linear Regression, with an average RMSE of 1603.024 over the 10 datasets tested. Using PSO feature selection increased accuracy, reducing the average RMSE to 1552.999. The results indicate that, compared with the original linear regression model, the accuracy of software effort estimation improved by 3.12% by applying PSO feature selection.

    An Empirical Analysis on Software Development Efforts Estimation in Machine Learning Perspective

    The prediction of effort is a vital factor in the success of any software development project. The availability of expert systems for software effort estimation helps minimize effort and cost for every software project while also supporting timely completion and proper resource management. This article supports software project managers and decision makers by providing a state-of-the-art empirical analysis of effort estimation methods based on machine-learning approaches. In this paper, five machine-learning techniques, polynomial linear regression, ridge regression, decision trees, support vector regression, and Multilayer Perceptron (MLP), are investigated for software development effort estimation using benchmark publicly available datasets. The empirical performance of the machine-learning methods is investigated on seven standard datasets: Albrecht, Desharnais, COCOMO81, NASA, Kemerer, China, and Kitchenham. Furthermore, the approaches are evaluated statistically using the performance metrics MMRE, Pred(25), and R²-score. The empirical results reveal that the decision-tree-based techniques produce more adequate results on the Desharnais, COCOMO, China, and Kitchenham datasets in terms of all three performance metrics. On the Albrecht and NASA datasets, the ridge regression method outperformed the other techniques, except on the Pred(25) metric, where decision trees performed better.
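The evaluation metrics named above are straightforward to compute. A sketch of MMRE and Pred(25) on invented actual/predicted effort values:

```python
import numpy as np

def mmre(actual, predicted):
    """Mean Magnitude of Relative Error: mean of |actual - pred| / actual."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.mean(np.abs(actual - predicted) / actual))

def pred(actual, predicted, level=0.25):
    """Pred(25): fraction of estimates whose relative error is within 25%."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    mre = np.abs(actual - predicted) / actual
    return float(np.mean(mre <= level))

# Invented effort values (person-hours), for illustration only.
actual    = [100, 200, 400, 800]
predicted = [110, 140, 420, 790]
print("MMRE:", mmre(actual, predicted))      # lower is better
print("Pred(25):", pred(actual, predicted))  # higher is better
```

MMRE penalizes every estimate by its relative error, while Pred(25) only counts how many estimates land within the 25% band, which is why the two metrics can rank models differently, as the abstract reports.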

    Artificial intelligence in project management: a brief systematic literature review

    Project management is a common field in many industries, and it is not immune to the innovations that artificial intelligence is bringing to the world. Even so, the application of artificial intelligence is not widespread in companies, and especially not in all project management areas. The reasons are not clear but seem to be related to uncertainty about applying artificial intelligence in project management. The purpose was to identify the potential and limitations of artificial intelligence in the specific area of project management through a systematic literature review, with which it was possible to analyse and correlate the selected articles and identify patterns and tendencies. In the end, the increased interest of the scientific community in this field was clear, although some areas remain to be explored.

    Predictive Data Analysis Using Linear Regression and Random Forest

    A statistical technique called predictive analysis (or analytics) makes use of machine learning to find patterns in data and forecast future outcomes. It is now preferred to go beyond descriptive analytics in order to learn whether training initiatives are effective and how they may be enhanced. Predictive analysis can use past as well as present data to make predictions about what might occur in the future. Businesses can improve upcoming learning projects by taking action after identifying potential risks or opportunities. This chapter compares two predictive analysis models: the Generalized Linear Model with Linear Regression (LR) and Decision Trees with Random Forest (RF). With an RMSE (Root Mean Square Error) of 0.0264965 and an arithmetic mean over all errors of 0.016056967, Linear Regression did better in this analysis than Random Forest, which had an RMSE of 0.117875 and an arithmetic mean over all errors of 0.07062315. These errors can be decreased further through hyper-parameter tuning. The combined strategy of averaging the LR and RF predictions nevertheless produced even more accurate predictions, and it mitigates the risk of over-fitting and of incorrect predictions by individual algorithms, which depends on the quality of the training data.
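The combined strategy described, plain averaging of the two models' predictions, can be illustrated with synthetic data. The error scales below are invented; real LR and RF predictions would come from fitted models rather than added noise:

```python
import numpy as np

rng = np.random.default_rng(1)
truth = rng.uniform(0.0, 1.0, size=1000)

# Hypothetical predictions from two models with independent errors,
# standing in for the chapter's Linear Regression and Random Forest outputs.
pred_lr = truth + rng.normal(scale=0.10, size=truth.size)
pred_rf = truth + rng.normal(scale=0.12, size=truth.size)
pred_avg = (pred_lr + pred_rf) / 2.0  # the combined strategy: averaging

def rmse(pred):
    """Root Mean Square Error against the ground truth."""
    return float(np.sqrt(np.mean((pred - truth) ** 2)))

print(f"LR: {rmse(pred_lr):.4f}  RF: {rmse(pred_rf):.4f}  "
      f"avg: {rmse(pred_avg):.4f}")
```

When the two models' errors are roughly independent and similar in scale, the average's error variance is about half of each individual model's, which is the statistical reason the ensemble beats both components here.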

    Using actors and use cases for software size estimation

    Software size estimation is a complex, nontrivial task based on data analysis or an algorithmic estimation approach, and it is important for software project planning and management. In this paper, a new method called Actors and Use Cases Size Estimation is proposed. The method is based on the number of actors and use cases only and uses stepwise regression; it led to a very significant reduction in errors when estimating the size of software systems compared to Use Case Points-based methods. The proposed method is independent of Use Case Points, which eliminates the effect of inaccurate determination of Use Case Points components, because such components are not used in the proposed method. © 2021 by the author. Licensee MDPI, Basel, Switzerland. Faculty of Applied Informatics, Tomas Bata University in Zlín.
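A regression of size on the number of actors and use cases, in the spirit of the proposed method, can be sketched as follows. The historical project counts and sizes are invented, and plain least squares stands in for the paper's stepwise regression:

```python
import numpy as np

# Hypothetical historical projects: actor count, use-case count, and
# measured size. Values are illustrative, not the paper's dataset.
actors    = np.array([ 3,  5,  8,  4, 10,  6,  7,  2], dtype=float)
use_cases = np.array([12, 20, 35, 15, 48, 25, 30,  8], dtype=float)
size      = np.array([140, 230, 400, 175, 540, 290, 345, 100], dtype=float)

# Least-squares fit of: size ~ b0 + b1*actors + b2*use_cases
A = np.column_stack([np.ones_like(actors), actors, use_cases])
b0, b1, b2 = np.linalg.lstsq(A, size, rcond=None)[0]

# Estimate for a new project with 6 actors and 22 use cases.
new_estimate = b0 + b1 * 6 + b2 * 22
print(round(float(new_estimate), 1))
```

Stepwise regression would additionally add or drop predictors based on significance; with only two candidate predictors the fitted model has the same form.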

    Effort and cost estimation for software project development using artificial neural networks based on Taguchi's orthogonal vector plans

    The modern software industry requires fast, high-quality, and accurate forecasting of effort and costs before the actual effort is invested in realizing the software product. Such requirements are a challenge for any software company, which must be ready to meet the expectations of the software customer. The main factor in the successful development of software projects, and in reducing the risk of errors, is an adequate estimation of the effort and costs invested during implementation. This doctoral dissertation analyzes existing approaches and models that have so far not been sufficiently precise and efficient, a shortcoming reflected in only about 30% of software solutions being implemented successfully. The main goal is to present three new, improved models based on an efficient artificial intelligence tool: artificial neural networks (ANN). All three improved models use different ANN architectures, constructed on the basis of Taguchi's orthogonal vector plans. The aim is to optimize the improved models so as to avoid repeating experiments and long training times. Applying a clustering method to several different sets of real projects further mitigates their heterogeneous structure. In addition, the input values of the projects are homogenized by fuzzification, which achieves even greater reliability and accuracy of the obtained results. Optimization by the Taguchi method, together with increased coverage of a wide range of different projects, leads to the efficient and successful completion of as many different software projects as possible. The main contributions of this dissertation are: constructing and identifying the best model for estimating effort and cost, selecting the best ANN architecture whose values converge fastest to the minimum magnitude of relative error, achieving a small number of experiments, and reducing software effort estimation time due to the convergence rate.
    Additional criteria and constraints are introduced to monitor and execute experiments using a precise algorithm applied to all three newly proposed models. In addition to monitoring the convergence rate of each ANN architecture, the influence of each model's input values on the change in the model's magnitude of relative error is also monitored. The models constructed in this way have been experimentally checked and confirmed several times on different sets of real projects and can be applied in practice; the obtained results indicate that the achieved error values are lower than those presented so far. The proposed models can therefore be reliably applied to estimate effort and costs not only for software development but also for projects in other areas of industry and science.
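The Taguchi idea of covering hyperparameter combinations with few experiments can be sketched with the standard L4(2³) orthogonal array: four training runs cover three two-level factors so that every pair of factors sees all four level combinations. The hyperparameter names and levels below are invented, not the dissertation's actual factors:

```python
# L4 orthogonal array for three two-level factors: each column is
# balanced, and every pair of columns contains each of the four level
# combinations exactly once.
L4 = [
    (0, 0, 0),
    (0, 1, 1),
    (1, 0, 1),
    (1, 1, 0),
]

# Hypothetical ANN hyperparameter levels (names are illustrative).
levels = {
    "hidden_units":  [8, 16],
    "learning_rate": [0.01, 0.1],
    "activation":    ["tanh", "relu"],
}

names = list(levels)
runs = [{name: levels[name][row[i]] for i, name in enumerate(names)}
        for row in L4]
for run in runs:
    print(run)  # only 4 experiments instead of the full 2**3 = 8
```

Each run would train one ANN configuration; analyzing the per-factor averages of the resulting errors then points to the best level of each factor without enumerating the full grid, which is the experiment-count reduction the dissertation relies on.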
