
    Data analysis in software engineering: an approach to incremental data-driven effort estimation.

    Cost and effort estimation in software projects have been investigated for several years. Nonetheless, compared to other engineering fields, a large number of projects still fail in different phases due to prediction errors. On average, large IT projects run 45 percent over budget and seven percent over time, while delivering 56 percent less value than predicted. Several effort estimation models have been defined in the past, mainly based on user experience or on data collected in previous projects, but no studies support incremental effort estimation and tracking. Iterative development techniques, and in particular Agile techniques, partially support incremental effort estimation, but due to the complexity of the estimation, the total effort still tends to be higher than expected. Therefore, this work focuses on defining an incremental, data-driven estimation model that supports developers and project managers in keeping track of the remaining effort. The result of this work is a set of effort estimation models based on context factors, such as the application domain, the size of the project team, and other characteristics. Moreover, we do not aim at defining a model with generic parameters to be applied in similar contexts; instead, we define a mathematical approach to customize the model for each development team.
    The first step of this work focused on the analysis of existing estimation models and on the collection of evidence on the accuracy of each model. We then defined our approach based on Ordinary Least Squares (OLS) regression analysis, so as to investigate the existence of a correlation between the actual effort and other characteristics. While building the OLS models, we analyzed the data set and removed outliers to prevent them from unduly influencing the regression lines. To validate the results, we applied 10-fold cross-validation, assessing accuracy in terms of R2, MRE, and MdMRE.
    The model has been applied to two different case studies. First, we analyzed a large number of projects developed by means of the waterfall process. Then, we analyzed an Agile process, so as to understand whether the model is also applicable to agile methodologies. In the first case study, we want to understand whether we can define an effort estimation model that predicts the effort of the next development phase based on the effort already spent. For this reason, we investigated whether it is possible to use:
    • the effort of one phase to estimate the effort of the next development phase
    • the effort of one phase to estimate the remaining project effort
    • the effort spent up to a development phase to estimate the effort of that phase
    • the effort spent up to a development phase to estimate the remaining project effort
    Then, we investigated whether the prediction accuracy can be improved by considering other common context factors, such as project domain, development platform, development process, programming language, and number of Function Points. We analyzed projects collected in the ISBSG dataset and, considering the different context factors available, we ran a total of 4,500 analyses to understand which factors are the most suitable in a specific context.
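Below is a minimal sketch of the modeling pipeline described above, written in Python with NumPy and scikit-learn. The dataset, the variable names, and the two-standard-deviation outlier rule are illustrative assumptions rather than details taken from the thesis; MRE denotes the Magnitude of Relative Error, |actual - predicted| / actual, and MdMRE its median over the test set.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import KFold

# Synthetic stand-in for a project dataset: effort (person-hours) of a
# completed phase, and the remaining project effort to be predicted.
rng = np.random.default_rng(0)
phase_effort = rng.uniform(50, 500, 120)
remaining_effort = 2.5 * phase_effort + rng.normal(0, 40, 120)

# Outlier removal: drop points whose residual from a first OLS fit exceeds
# two standard deviations (one common rule; the thesis may use another).
X = phase_effort.reshape(-1, 1)
y = remaining_effort
residuals = y - LinearRegression().fit(X, y).predict(X)
keep = np.abs(residuals) < 2 * residuals.std()
X, y = X[keep], y[keep]

# 10-fold cross-validation, collecting R2 per fold and the Magnitude of
# Relative Error |actual - predicted| / actual per project.
r2_scores, mre_values = [], []
for train, test in KFold(n_splits=10, shuffle=True, random_state=0).split(X):
    model = LinearRegression().fit(X[train], y[train])
    predicted = model.predict(X[test])
    r2_scores.append(r2_score(y[test], predicted))
    mre_values.extend(np.abs(y[test] - predicted) / y[test])

print(f"R2 = {np.mean(r2_scores):.2f}, "
      f"MRE = {np.mean(mre_values):.2f}, MdMRE = {np.median(mre_values):.2f}")
```

The same skeleton extends to the other model variants simply by swapping the predictor (cumulative effort up to a phase) or the response (effort of the next phase) fed into the regression.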
The results of this first case study show a set of statistically significant correlations between: (1) the effort spent in one phase and the effort spent in the following one; (2) the effort spent in a phase and the remaining effort; (3) the cumulative effort up to the current phase and the remaining effort. However, the results also show that these estimation models come with different degrees of goodness of fit. Finally, including further information, such as the functional size, does not significantly improve estimation quality.
In the second case study, we analyzed a project developed with an agile methodology (SCRUM). In this case, we want to understand whether it is possible to use our estimation approach to help developers increase the accuracy of expert-based estimation. In SCRUM, effort estimates are produced at the beginning of each sprint, usually based on story points. The use of functional size measures, specifically selected for the type of application and the development conditions, is expected to allow more accurate effort estimates. The goal of the work presented here is to verify this hypothesis on experimental data, evaluating the association of story measures with actual effort and the accuracy of the resulting effort model. The study shows that the developers' estimates are more accurate than those based on functional measurement. In conclusion, our study shows that easy-to-collect functional measures do not help developers improve the accuracy of effort estimation in Moonlight SCRUM.
The models derived in our work can be used by project managers and developers who need to estimate or control project effort in a development process. Developers can also use them to track their performance and understand the reasons for effort estimation errors. Finally, the models help project managers react as early as possible and reduce project failures due to estimation errors. The detailed results are reported in the following chapters:
• Chapter 1 introduces this work
• Chapter 2 reviews the related literature on effort estimation techniques
• Chapter 3 presents the proposed effort estimation approach
• Chapter 4 describes the application of our approach to the waterfall process
• Chapter 5 describes the application of our approach to SCRUM
• Chapter 6 reports the conclusions and future work

    Open Tracing Tools: Overview and Critical Comparison

    Background. To cope with the rapidly growing complexity of contemporary software architectures, tracing has become an increasingly critical practice and has been widely adopted by software engineers. By adopting tracing tools, practitioners can monitor, debug, and optimize distributed software architectures more easily. However, with the excessive number of valid candidates, researchers and practitioners have a hard time finding and selecting suitable tracing tools by systematically considering their features and advantages. Objective. To this end, this paper aims to provide an overview of the popular tracing tools on the market via a critical comparison. Method. We first identified 11 tools in an objective, systematic, and reproducible manner by adopting the Systematic Multivocal Literature Review protocol. Then, we characterized each tool by looking at its (1) measured features, (2) popularity in both peer-reviewed literature and online media, and (3) benefits and issues. Results. This paper presents a systematic comparison among the selected tracing tools in terms of their features, popularity, benefits, and issues. Conclusion. The comparison mainly shows that each tracing tool provides a unique combination of features, with different pros and cons. The contribution of this paper is to give practitioners a better understanding of the tracing tools, facilitating their adoption
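As a concrete illustration of what such tracing tools instrument, the following Python sketch uses the OpenTelemetry SDK, one widely known open tracing API; whether it is among the 11 tools compared in the paper is not stated here, and the service and span names are invented for the example.

```python
# Requires: pip install opentelemetry-sdk
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor

# Export finished spans to the console instead of a tracing backend,
# so the example is self-contained.
provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer("checkout-service")  # invented service name

with tracer.start_as_current_span("handle_request") as span:
    span.set_attribute("http.method", "GET")  # metadata attached to the span
    with tracer.start_as_current_span("query_inventory"):
        pass  # nested call: recorded as a child span of handle_request
```

Each `start_as_current_span` call records a span, and nested calls become parent-child relationships; these trees of spans are the distributed traces that the compared tools collect and visualize.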

    Technical Debt Prioritization: State of the Art. A Systematic Literature Review

    Background. Software companies need to manage and refactor Technical Debt issues. Therefore, it is necessary to understand whether and when refactoring Technical Debt should be prioritized with respect to developing features or fixing bugs. Objective. The goal of this study is to investigate the existing body of knowledge in software engineering and to understand which Technical Debt prioritization approaches have been proposed in research and industry. Method. We conducted a Systematic Literature Review of 384 unique papers published until 2018, following a consolidated methodology applied in software engineering, and included 38 primary studies. Results. Different approaches have been proposed for Technical Debt prioritization, all with different goals and optimizing on different criteria. The proposed measures capture only a small part of the plethora of factors used to prioritize Technical Debt qualitatively in practice, and we report an impact map of such factors. However, there is a lack of empirically validated tools. Conclusion. We observed that Technical Debt prioritization research is preliminary and that there is no consensus on which factors are important or how to measure them. Consequently, current research cannot be considered conclusive, and in this paper we outline directions for necessary future investigations

    Does Cyclomatic or Cognitive Complexity Better Represents Code Understandability? An Empirical Investigation on the Developers Perception

    Background. Code understandability is fundamental: developers need to clearly understand the code they are modifying. Low understandability can increase the amount of coding effort, and misinterpretation of code has an impact on the entire development process. Ideally, developers should write clear and understandable code with the least possible effort. Objective. The goal of this work is to investigate whether McCabe's Cyclomatic Complexity or Cognitive Complexity is a better predictor of the code understandability perceived by developers, so as to understand which of the two can be used as a criterion to evaluate whether a piece of code is understandable. Method. We designed and conducted an empirical study among 216 junior developers with professional experience ranging from one to four years. We asked them to manually inspect and rate the understandability of 12 Java classes that exhibit different levels of Cyclomatic and Cognitive Complexity. Results. Cognitive Complexity slightly outperforms Cyclomatic Complexity in predicting the developers' perceived understandability. Conclusion. The identification of a clear and validated measure of code complexity is still an open issue: neither the long-established McCabe Cyclomatic Complexity nor the more recent Cognitive Complexity is a good predictor of code understandability, at least when considering the complexity perceived by junior developers
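To make the difference between the two measures concrete, here is a hypothetical Python snippet (not one of the 12 Java classes from the study), annotated with hand-computed increments: Cyclomatic Complexity counts decision points plus one, while Cognitive Complexity, in its SonarSource formulation, additionally penalizes nesting.

```python
def contains_negative(rows):
    for row in rows:        # Cyclomatic +1 | Cognitive +1
        for value in row:   # Cyclomatic +1 | Cognitive +2 (nesting level 1)
            if value < 0:   # Cyclomatic +1 | Cognitive +3 (nesting level 2)
                return True
    return False
# Cyclomatic = 3 decision points + 1 = 4; Cognitive = 1 + 2 + 3 = 6.

def classify(x):
    if x < 0:               # Cyclomatic +1 | Cognitive +1
        return -1
    if x == 0:              # Cyclomatic +1 | Cognitive +1
        return 0
    if x < 10:              # Cyclomatic +1 | Cognitive +1
        return 1
    return 2
# Cyclomatic = 3 + 1 = 4; Cognitive = 1 + 1 + 1 = 3. Both functions share
# the same Cyclomatic score, but only Cognitive Complexity reflects that the
# flat, early-return structure is easier to follow than the nested loops.
```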