1,153 research outputs found

    Learning From Mistakes: Machine Learning Enhanced Human Expert Effort Estimates

    Get PDF
    In this paper, we introduce a novel approach to predictive modeling for software engineering, named Learning From Mistakes (LFM). The core idea underlying our proposal is to automatically learn from past estimation errors made by human experts, in order to predict the characteristics of their future misestimates, therefore resulting in improved future estimates. We show the feasibility of LFM by investigating whether it is possible to predict the type, severity and magnitude of errors made by human experts when estimating the development effort of software projects, and whether it is possible to use these predictions to enhance future estimations. To this end we conduct a thorough empirical study investigating 402 maintenance and new development industrial software projects. The results of our study reveal that the type, severity and magnitude of errors are all, indeed, predictable. Moreover, we find that by exploiting these predictions, we can obtain significantly better estimates than those provided by random guessing, human experts and traditional machine learners in 31 out of the 36 cases considered (86%), with large and very large effect sizes in the majority of these cases (81%). This empirical evidence opens the door to the development of techniques that use the power of machine learning, coupled with the observation that human errors are predictable, to support engineers in estimation tasks rather than replacing them with machine-provided estimates

    An Update on Effort Estimation in Agile Software Development: A Systematic Literature Review

    Full text link
    [EN] Software developers require effective effort estimation models to facilitate project planning. Although Usman et al. systematically reviewed and synthesized the effort estimation models and practices for Agile Software Development (ASD) in 2014, new evidence may provide new perspectives for researchers and practitioners. This article presents a systematic literature review that updates the Usman et al. study from 2014 to 2020 by analyzing the data extracted from 73 new papers. This analysis allowed us to identify six agile methods: Scrum, Xtreme Programming and four others, in all of which expert-based estimation methods continue to play an important role. This is particularly the case of Planning Poker, which is very closely related to the most frequently used size metric (story points) and the way in which software requirements are specified in ASD. There is also a remarkable trend toward studying techniques based on the intensive use of data. In this respect, although most of the data originate from single-company datasets, there is a significant increase in the use of cross-company data. With regard to cost factors, we applied the thematic analysis method. The use of team and project factors appears to be more frequent than the consideration of more technical factors, in accordance with agile principles. Finally, although accuracy is still a challenge, we identified that improvements have been made. On the one hand, an increasing number of papers showed acceptable accuracy values, although many continued to report inadequate results. On the other, almost 29% of the papers that reported the accuracy metric used reflected aspects concerning the validation of the models and 18% reported the effect size when comparing models.This work was supported by the Spanish Ministry of Science, Innovation and Universities through the Adapt@Cloud Project under Grant TIN2017-84550-R.Fernández-Diego, M.; Méndez, ER.; González-Ladrón-De-Guevara, F.; Abrahao Gonzales, SM.; Insfran, E. (2020). An Update on Effort Estimation in Agile Software Development: A Systematic Literature Review. IEEE Access. 8:166768-166800. https://doi.org/10.1109/ACCESS.2020.3021664S166768166800

    Pragmatic cost estimation for web applications

    Get PDF
    Cost estimation for web applications is an interesting and difficult challenge for researchers and industrial practitioners. It is a particularly valuable area of ongoing commercial research. Attaining on accurate cost estimation for web applications is an essential element in being able to provide competitive bids and remaining successful in the market. The development of prediction techniques over thirty years ago has contributed to several different strategies. Unfortunately there is no collective evidence to give substantial advice or guidance for industrial practitioners. Therefore to address this problem, this thesis shows the way by investigating the characteristics of the dataset by combining the literature review and industrial survey findings. The results of the systematic literature review, industrial survey and an initial investigation, have led to an understanding that dataset characteristics may influence the cost estimation prediction techniques. From this, an investigation was carried out on dataset characteristics. However, in the attempt to structure the characteristics of dataset it was found not to be practical or easy to get a defined structure of dataset characteristics to use as a basis for prediction model selection. Therefore the thesis develops a pragmatic cost estimation strategy based on collected advice and general sound practice in cost estimation. The strategy is composed of the following five steps: test whether the predictions are better than the means of the dataset; test the predictions using accuracy measures such as MMRE, Pred and MAE knowing their strengths and weaknesses; investigate the prediction models formed to see if they are sensible and reasonable model; perform significance testing on the predictions; and get the effect size to establish preference relations of prediction models. The results from this pragmatic cost estimation strategy give not only advice on several techniques to choose from, but also give reliable results. Practitioners can be more confident about the estimation that is given by following this pragmatic cost estimation strategy. It can be concluded that the practitioners should focus on the best strategy to apply in cost estimation rather than focusing on the best techniques. Therefore, this pragmatic cost estimation strategy could help researchers and practitioners to get reliable results. The improvement and replication of this strategy over time will produce much more useful and trusted results.Cost estimation for web applications is an interesting and difficult challenge for researchers and industrial practitioners. It is a particularly valuable area of ongoing commercial research. Attaining on accurate cost estimation for web applications is an essential element in being able to provide competitive bids and remaining successful in the market. The development of prediction techniques over thirty years ago has contributed to several different strategies. Unfortunately there is no collective evidence to give substantial advice or guidance for industrial practitioners. Therefore to address this problem, this thesis shows the way by investigating the characteristics of the dataset by combining the literature review and industrial survey findings. The results of the systematic literature review, industrial survey and an initial investigation, have led to an understanding that dataset characteristics may influence the cost estimation prediction techniques. From this, an investigation was carried out on dataset characteristics. However, in the attempt to structure the characteristics of dataset it was found not to be practical or easy to get a defined structure of dataset characteristics to use as a basis for prediction model selection. Therefore the thesis develops a pragmatic cost estimation strategy based on collected advice and general sound practice in cost estimation. The strategy is composed of the following five steps: test whether the predictions are better than the means of the dataset; test the predictions using accuracy measures such as MMRE, Pred and MAE knowing their strengths and weaknesses; investigate the prediction models formed to see if they are sensible and reasonable model; perform significance testing on the predictions; and get the effect size to establish preference relations of prediction models. The results from this pragmatic cost estimation strategy give not only advice on several techniques to choose from, but also give reliable results. Practitioners can be more confident about the estimation that is given by following this pragmatic cost estimation strategy. It can be concluded that the practitioners should focus on the best strategy to apply in cost estimation rather than focusing on the best techniques. Therefore, this pragmatic cost estimation strategy could help researchers and practitioners to get reliable results. The improvement and replication of this strategy over time will produce much more useful and trusted results

    Rethinking Productivity in Software Engineering

    Get PDF
    Get the most out of this foundational reference and improve the productivity of your software teams. This open access book collects the wisdom of the 2017 "Dagstuhl" seminar on productivity in software engineering, a meeting of community leaders, who came together with the goal of rethinking traditional definitions and measures of productivity. The results of their work, Rethinking Productivity in Software Engineering, includes chapters covering definitions and core concepts related to productivity, guidelines for measuring productivity in specific contexts, best practices and pitfalls, and theories and open questions on productivity. You'll benefit from the many short chapters, each offering a focused discussion on one aspect of productivity in software engineering. Readers in many fields and industries will benefit from their collected work. Developers wanting to improve their personal productivity, will learn effective strategies for overcoming common issues that interfere with progress. Organizations thinking about building internal programs for measuring productivity of programmers and teams will learn best practices from industry and researchers in measuring productivity. And researchers can leverage the conceptual frameworks and rich body of literature in the book to effectively pursue new research directions. What You'll Learn Review the definitions and dimensions of software productivity See how time management is having the opposite of the intended effect Develop valuable dashboards Understand the impact of sensors on productivity Avoid software development waste Work with human-centered methods to measure productivity Look at the intersection of neuroscience and productivity Manage interruptions and context-switching Who Book Is For Industry developers and those responsible for seminar-style courses that include a segment on software developer productivity. Chapters are written for a generalist audience, without excessive use of technical terminology. ; Collects the wisdom of software engineering thought leaders in a form digestible for any developer Shares hard-won best practices and pitfalls to avoid An up to date look at current practices in software engineering productivit

    A Review: Effort Estimation Model for Scrum Projects using Supervised Learning

    Get PDF
    Effort estimation practice in Agile is a critical component of the methodology to help cross-functional teams to plan and prioritize their work. Agile approaches have emerged in recent years as a more adaptable means of creating software projects because they consistently produce a workable end product that is developed progressively, preventing projects from failing entirely. Agile software development enables teams to collaborate directly with clients and swiftly adjust to changing requirements. This produces a result that is distinct, gradual, and targeted. It has been noted that the present Scrum estimate approach heavily relies on historical data from previous projects and expert opinion, while existing agile estimation methods like analogy and planning poker become unpredictable in the absence of historical data and experts. User Stories are used to estimate effort in the Agile approach, which has been adopted by 60–70% of the software businesses. This study's goal is to review a variety of strategies and techniques that will be used to gauge and forecast effort. Additionally, the supervised machine learning method most suited for predictive analysis is reviewed in this paper

    Data analysis in software engineering: an approach to incremental data-driven effort estimation.

    Get PDF
    Cost and effort estimation in software projects have been investigated for several years. Nonetheless, compared to other engineering fields, there is still a large number of projects that fail into different phases due to prediction errors. On average, large IT projects run 45 percent over budget and seven percent over time, while delivering 56 percent less value than predicted. Several effort estimation models have been defined in the past, mainly based on user experience or on data collected in previous projects, but no studies support an incremental effort estimation and tracking. Iterative development techniques, and in particular Agile techniques, partially support the incremental effort estimation, but due to the complexity of the estimation, the total effort always tend to be higher than expected. Therefore, this work focuses on defining an adequate incremental and data driven estimation model so as to support developers and project managers to keep track of the remaining effort incrementally. The result of this work is a set of estimation models for effort estimation, based on a set of context factors, such as the domain of application developed, size of the project team and other characteristics. Moreover, in this work we do not aim at defining a model with generic parameters to be applied in similar context, but we define a mathematical approach so as to customize the model for each development team. The first step of this work focused on analysis of the existing estimation models and collection of evidence on the accuracy of each model. We then defined our approach based on Ordinary Least Squares regression analysis (OLS)so as to investigate the existence of a correlation between the actual effort and other characteristics. While building the OLS models we analyzed the data set and removed the outliers to prevent them from unduly influencing the OLS regression lines obtained. In order to validate the result we apply a 10-fold cross-validation assessing the accuracy of the results in terms of R2, MRE and MdMRE. The model has been applied to two different case studies. First, we analyzed a large number of projects developed by means of the waterfall process. Then, we analyzed an Agile process, so as to understand if the developed model is also applicable to agile methodologies. In the first case study we want to understand if we can define an effort estimation model to predict the effort of the next development phase based on the effort already spent. For this reason, we investigated if it is possible to use: \u2022 the effort of one phase for estimating the effort of the next development phase \u2022 the effort of one phase for estimating the remaining project effort \u2022 the effort spent up to a development phase to estimate its effort \u2022 the effort spent up to a development phase to estimate the remaining project effort Then, we investigated if the prediction accuracy can be improved considering other common context factors such as project domain, development language, development platform, development process, programming language and number of Function Points. We analyzed projects collected in the ISBSG dataset and, considering the different context factors available, we run a total of 4500 analysis, to understand which are the more suitable factors to be applied in a specific context. The results of this first case study show a set of statistically significant correlations between: (1) the effort spent in one phase and the effort spent in the following one; (2) the effort spent in a phase and the remaining effort; (3) the cumulative effort up to the current phase and the remaining effort. However, the results also show that these estimation models come with different degrees of goodness of fit. Finally, including further information, such as the functional size, does not significantly improve estimation quality. In the second case study, a project developed with an agile methodology (SCRUM) has been analyzed. In this case, we want to understand if is possible to use our estimation approach, so as to help developers to increase the accuracy of the expert based estimation. SCRUM, effort estimations are carried out at the beginning of each sprint, usually based on story points. The usage of functional size measures, specifically selected for the type of application and development conditions, is expected to allow for more accurate effort estimates. The goal of the work presented here is to verify this hypothesis, based on experimental data. The association of story measures to actual effort and the accuracy of the resulting effort model is evaluated. The study shows that developers\u2019 estimation is more accurate than those based on functional measurement. In conclusion, our study shows that, easy to collect functional measures do not help developers in improving the accuracy of the effort estimation in Moonlight SCRUM. These models derived in our work can be used by project managers and developers that need to estimate or control the project effort in a development process. These models can also be used by the developers to track their performances and understand the reasons of effort estimation errors. Finally the model help project managers to react as soon as possible and reduce project failures due to estimation errors. The detailed results are reported in the next sections as follows: \u2022 Chapter 1 reports the introduction to this work \u2022 Chapter 2 reports the related literature review on effort estimation techniques \u2022 Chapter 3 reports the proposed effort estimation approach \u2022 Chapter 4 describe the application of our approach to Waterfall process \u2022 Chapter 5 describe the application of our approach to SCRUM \u2022 Chapter 6 reports the conclusion and the future work

    The usage of ISBSG data fields in software effort estimation: A systematic mapping study

    Full text link
    [EN] The International Software Benchmarking Standards Group (ISBSG) maintains a repository of data about completed software projects. A common use of the ISBSG dataset is to investigate models to estimate a software project's size, effort, duration, and cost. The aim of this paper is to determine which and to what extent variables in the ISBSG dataset have been used in software engineering to build effort estimation models. For that purpose a systematic mapping study was applied to 107 research papers, obtained after a filtering process, that were published from 2000 until the end of 2013, and which listed the independent variables used in the effort estimation models. The usage of ISBSG variables for filtering, as dependent variables, and as independent variables is described. The 20 variables (out of 71) mostly used as independent variables for effort estimation are identified and analysed in detail, with reference to the papers and types of estimation methods that used them. We propose guidelines that can help researchers make informed decisions about which ISBSG variables to select for their effort estimation models.González-Ladrón-De-Guevara, F.; Fernández-Diego, M.; Lokan, C. (2016). The usage of ISBSG data fields in software effort estimation: A systematic mapping study. Journal of Systems and Software. 113:188-215. doi:10.1016/j.jss.2015.11.040S18821511

    Schätzwerterfüllung in Softwareentwicklungsprojekten

    Get PDF
    Effort estimates are of utmost economic importance in software development projects. Estimates bridge the gap between managers and the invisible and almost artistic domain of developers. They give a means to managers to track and control projects. Consequently, numerous estimation approaches have been developed over the past decades, starting with Allan Albrecht's Function Point Analysis in the late 1970s. However, this work neither tries to develop just another estimation approach, nor focuses on improving accuracy of existing techniques. Instead of characterizing software development as a technological problem, this work understands software development as a sociological challenge. Consequently, this work focuses on the question, what happens when developers are confronted with estimates representing the major instrument of management control? Do estimates influence developers, or are they unaffected? Is it irrational to expect that developers start to communicate and discuss estimates, conform to them, work strategically, hide progress or delay? This study shows that it is inappropriate to assume an independency of estimated and actual development effort. A theory is developed and tested, that explains how developers and managers influence the relationship between estimated and actual development effort. The theory therefore elaborates the phenomenon of estimation fulfillment.Schätzwerte in Softwareentwicklungsprojekten sind von besonderer ökonomischer Wichtigkeit. Sie überbrücken die Lücke zwischen Projektleitern und der unsichtbaren und beinahe künstlerischen Domäne der Entwickler. Sie stellen ein Instrument dar, welches erlaubt, Projekte zu verfolgen und zu kontrollieren. Daher wurden in den vergangenen vier Jahrzehnten diverse Schätzverfahren entwickelt, beginnend mit der "Function Point" Analyse von Allan Albrecht. Diese Arbeit versucht allerdings weder ein neues Schätzverfahren zu entwickeln noch bestehende Verfahren zu verbessern. Anstatt Softwareentwicklung als technologisches Problem zu charakterisieren, wird in dieser Arbeit eine soziologische Perspektive genutzt. Dementsprechend fokussiert diese Arbeit die Frage, was passiert, wenn Entwickler mit Schätzwerten konfrontiert werden, die das wichtigste Kontrollinstrument des Managements darstellen? Lassen sich Entwickler von diesen Werten beeinflussen oder bleiben sie davon unberührt? Wäre es irrational, zu erwarten, dass Entwickler Schätzwerte kommunizieren, diese diskutieren, sich diesen anpassen, strategisch arbeiten sowie Verzögerungen verschleiern? Die vorliegende Studie zeigt, dass die Unabhängigkeitsannahme von Schätzwerten und tatsächlichem Entwicklungsaufwand unbegründet ist. Es wird eine Theorie entwickelt, welche erklärt, wie Entwickler und Projektleiter die Beziehung von Schätzungen und Aufwand beeinflussen und dass das Phänomen der Schätzwerterfüllung auftreten kann

    Rethinking Productivity in Software Engineering

    Get PDF
    Get the most out of this foundational reference and improve the productivity of your software teams. This open access book collects the wisdom of the 2017 "Dagstuhl" seminar on productivity in software engineering, a meeting of community leaders, who came together with the goal of rethinking traditional definitions and measures of productivity. The results of their work, Rethinking Productivity in Software Engineering, includes chapters covering definitions and core concepts related to productivity, guidelines for measuring productivity in specific contexts, best practices and pitfalls, and theories and open questions on productivity. You'll benefit from the many short chapters, each offering a focused discussion on one aspect of productivity in software engineering. Readers in many fields and industries will benefit from their collected work. Developers wanting to improve their personal productivity, will learn effective strategies for overcoming common issues that interfere with progress. Organizations thinking about building internal programs for measuring productivity of programmers and teams will learn best practices from industry and researchers in measuring productivity. And researchers can leverage the conceptual frameworks and rich body of literature in the book to effectively pursue new research directions. What You'll Learn Review the definitions and dimensions of software productivity See how time management is having the opposite of the intended effect Develop valuable dashboards Understand the impact of sensors on productivity Avoid software development waste Work with human-centered methods to measure productivity Look at the intersection of neuroscience and productivity Manage interruptions and context-switching Who Book Is For Industry developers and those responsible for seminar-style courses that include a segment on software developer productivity. Chapters are written for a generalist audience, without excessive use of technical terminology. ; Collects the wisdom of software engineering thought leaders in a form digestible for any developer Shares hard-won best practices and pitfalls to avoid An up to date look at current practices in software engineering productivit
    corecore