    Enhancing Software Project Outcomes: Using Machine Learning and Open Source Data to Employ Software Project Performance Determinants

    Many factors can influence the ongoing management and execution of technology projects. Some of these elements are known a priori during the project planning phase. Others require real-time data gathering and analysis throughout the lifetime of a project. These real-time project data elements are often neglected, misclassified, or otherwise misinterpreted during the project execution phase, resulting in an increased risk of delays, quality issues, and missed business opportunities. The overarching motivation for this research is to offer reliable improvements in software technology management and delivery. The primary purpose is to discover and analyze the impact, role, and level of influence of various project-related data on the ongoing management of technology projects. The study leverages open source data regarding software performance attributes. The goal is to temper the subjectivity currently used by project managers (PMs) with quantifiable measures when assessing project execution progress. Modern-day PMs who manage software development projects are charged with an arduous task. Often, they obtain their inputs from technical leads, who tend to be significantly more technical than the PMs themselves. When assessing software projects, PMs perform their role subject to the limitations of their capabilities and competencies. PMs must contend with the stresses of the business environment, the policies and procedures dictated by their organizations, and resource constraints. The second purpose of this research study is to propose methods by which conventional project assessment processes can be enhanced using quantitative methods that utilize real-time project execution data. Transferability of academic research to industry application is specifically addressed vis-à-vis a delivery framework that provides meaningful data to industry practitioners.

    Integration of Industry 4.0 technologies into Lean Six Sigma DMAIC: a systematic review

    This review examines which Industry 4.0 (I4.0) technologies are suitable for improving Lean Six Sigma (LSS) tasks and the benefits of integrating these technologies into improvement projects. It also explores existing integration frameworks and discusses their relevance. A quantitative analysis of 692 papers and an in-depth analysis of 41 papers revealed that "Analyse" is by far the best-supported DMAIC phase, through techniques such as Data Mining, Machine Learning, Big Data Analytics, Internet of Things, and Process Mining. This paper also proposes a DMAIC 4.0 framework based on multiple technologies. The mapping of I4.0-related techniques to DMAIC phases and tools is a novelty compared to previous studies regarding the diversity of digital technologies applied. LSS practitioners facing the challenges of increasing complexity and data volumes can benefit from understanding how I4.0 technology can support their DMAIC projects and which of the suggested approaches they can adopt for their context.
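
    As a rough illustration of the kind of phase-to-technique mapping the review proposes, the sketch below encodes candidate I4.0 techniques per DMAIC phase as a simple lookup table. Only the "Analyse" entry is taken from the abstract above; the assignments for the other phases are illustrative assumptions, not the paper's mapping.

        # Illustrative sketch: only the "Analyse" entry comes from the abstract;
        # the other phase assignments are assumptions for demonstration.
        DMAIC_I40_MAP = {
            "Define":  ["Process Mining"],
            "Measure": ["Internet of Things", "Big Data Analytics"],
            "Analyse": ["Data Mining", "Machine Learning", "Big Data Analytics",
                        "Internet of Things", "Process Mining"],  # best-supported phase
            "Improve": ["Machine Learning"],
            "Control": ["Internet of Things"],
        }

        def techniques_for(phase: str) -> list[str]:
            """Return candidate I4.0 techniques for a given DMAIC phase."""
            return DMAIC_I40_MAP.get(phase.capitalize(), [])

        print(techniques_for("analyse"))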

    Genetic Programming as Alternative for Predicting Development Effort of Individual Software Projects

    Statistical and genetic programming techniques have been used to predict the software development effort of large software projects. In this paper, a genetic programming model was used to predict the effort required in individually developed projects. The accuracy obtained from the genetic programming model was compared against that obtained from a statistical regression model. A sample of 219 projects developed by 71 practitioners was used for generating the two models, whereas another sample of 130 projects developed by 38 practitioners was used for validating them. The models used two kinds of lines of code as well as programming language experience as independent variables. Accuracy results from the genetic programming model suggest that it could be used to predict the software development effort of individual projects when these projects have been developed in a disciplined manner within a development-controlled environment.
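
    As a minimal sketch of the kind of comparison described above (not the paper's actual models or data), the snippet below fits a genetic programming regressor and an ordinary linear regression to synthetic project data with two line-of-code counts and a language-experience feature, then compares their residuals. It assumes the third-party gplearn package; all data and parameter choices are illustrative.

        # Minimal sketch, assuming the third-party gplearn package
        # (pip install gplearn); data and parameters are illustrative.
        import numpy as np
        from gplearn.genetic import SymbolicRegressor
        from sklearn.linear_model import LinearRegression

        rng = np.random.default_rng(0)
        # Features: new lines of code, modified lines of code, years of experience.
        X = rng.uniform([50, 0, 0], [500, 200, 10], size=(219, 3))
        effort = 20 + 0.3 * X[:, 0] + 0.1 * X[:, 1] - 2 * X[:, 2] + rng.normal(0, 10, 219)

        gp = SymbolicRegressor(population_size=500, generations=20, random_state=0)
        gp.fit(X, effort)
        lr = LinearRegression().fit(X, effort)

        # Compare mean absolute residuals of the two models on the training sample.
        print("GP :", np.abs(effort - gp.predict(X)).mean())
        print("MLR:", np.abs(effort - lr.predict(X)).mean())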

    Applying Absolute Residuals as Evaluation Criterion for Estimating the Development Time of Software Projects by Means of a Neuro-Fuzzy Approach

    In the software development field, software practitioners expend between 30% and 40% more effort than is predicted. Accordingly, researchers have proposed new models for estimating the development effort such that the estimates of these models are close to actual values. In this study, an application based on a new neuro-fuzzy system (NFS) is analyzed, and the NFS accuracy is compared to that of a statistical multiple linear regression (MLR) model. The criterion for evaluating the accuracy of estimation models has mainly been the Magnitude of Relative Error (MRE); however, MRE was recently found to be asymmetric, and the use of Absolute Residuals (AR) has been proposed instead. In this study, therefore, the accuracy results of the NFS and MLR were based on AR. After a statistical paired t-test was performed, results showed that the accuracy of the new NFS is statistically better than that of the MLR at the 99% confidence level. It can be concluded that a new NFS could be used for predicting the effort of software development projects when they have been individually developed within a disciplined process.
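
    For concreteness, here is a minimal sketch of the two accuracy criteria and the paired test mentioned above; the prediction values are hypothetical. The asymmetry that motivates AR is visible in the MRE formula: dividing by actual effort bounds the penalty for underestimates at 1 while leaving overestimates unbounded.

        # Sketch of the evaluation criteria discussed above; data are hypothetical.
        import numpy as np
        from scipy import stats

        actual   = np.array([12.0, 30.0, 45.0,  8.0, 60.0])  # actual effort
        pred_nfs = np.array([11.0, 28.0, 47.0,  9.0, 57.0])  # neuro-fuzzy predictions
        pred_mlr = np.array([15.0, 25.0, 50.0,  6.0, 52.0])  # regression predictions

        def ar(y, y_hat):
            """Absolute Residuals: symmetric in over- and under-estimation."""
            return np.abs(y - y_hat)

        def mre(y, y_hat):
            """Magnitude of Relative Error: asymmetric, since underestimates
            are bounded at 1 while overestimates are unbounded."""
            return np.abs(y - y_hat) / y

        # Paired t-test on per-project ARs, as in the comparison described above.
        t, p = stats.ttest_rel(ar(actual, pred_nfs), ar(actual, pred_mlr))
        print(f"t = {t:.3f}, p = {p:.4f}")  # p < 0.01 corresponds to the 99% level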

    An Empirical investigation into software effort estimation by analogy

    Most practitioners recognise the important part accurate estimates of development effort play in the successful management of major software projects. However, it is widely recognised that current estimation techniques are often very inaccurate, while studies (Heemstra 1992; Lederer and Prasad 1993) have shown that effort estimation research is not being effectively transferred from the research domain into practical application. Traditionally, research has been almost exclusively focused on the advancement of algorithmic models (e.g. COCOMO (Boehm 1981) and SLIM (Putnam 1978)), where effort is commonly expressed as a function of system size. However, in recent years there has been a discernible movement away from algorithmic models, with non-algorithmic systems (often encompassing machine learning facets) being actively researched. This is potentially a very exciting and important time in this field, with new approaches regularly being proposed. One such technique, estimation by analogy, is the focus of this thesis. The principle behind estimation by analogy is that past experience can often provide insights and solutions to present problems. Software projects are characterised in terms of collectable features (such as the number of screens or the size of the functional requirements) and stored in a historical case base as they are completed. Once a case base of sufficient size has been cultivated, new projects can be estimated by finding similar historical projects and re-using the recorded effort. To make estimation by analogy feasible it became necessary to construct a software tool, dubbed ANGEL, which allowed the collection of historical project data and the generation of estimates for new software projects. A substantial empirical validation of the approach was made, encompassing approximately 250 real historical software projects across eight industrial data sets and using stepwise regression as a benchmark. Significance tests on the results accepted the hypothesis (at the 1% significance level) that estimation by analogy is a superior prediction system to stepwise regression in terms of accuracy. A study was also made of the sensitivity of the analogy approach. By growing project data sets in a pseudo time-series fashion it was possible to answer pertinent questions about the approach, such as: what are the effects of outlying projects, and what is the minimum data set size? The main conclusions of this work are that estimation by analogy is a viable estimation technique that would seem to offer some advantages over algorithmic approaches, including improved accuracy, easier use of categorical features, and an ability to operate even where no statistical relationships can be found.
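
    A minimal sketch of the analogy principle described above (not the ANGEL tool itself): projects become feature vectors, a new project is matched to its nearest historical cases by normalized Euclidean distance, and the recorded effort of those analogues is reused. The features and the case base here are hypothetical.

        # Sketch of estimation by analogy (not the ANGEL tool); data are hypothetical.
        import numpy as np

        # Historical case base: (number of screens, functional size) -> recorded effort.
        features = np.array([[10, 120], [25, 300], [5, 60], [40, 500], [15, 180]], float)
        efforts  = np.array([400.0, 1100.0, 180.0, 2000.0, 620.0])

        def estimate_by_analogy(new_project, k=2):
            """Estimate effort as the mean recorded effort of the k most similar cases."""
            # Scale each feature to [0, 1] so no single feature dominates the distance.
            lo, hi = features.min(axis=0), features.max(axis=0)
            scaled = (features - lo) / (hi - lo)
            query = (np.asarray(new_project, float) - lo) / (hi - lo)
            distances = np.linalg.norm(scaled - query, axis=1)
            analogues = np.argsort(distances)[:k]  # indices of the k closest projects
            return efforts[analogues].mean()

        print(estimate_by_analogy([20, 250]))  # effort estimate for a new project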

    A Predictive Model for Scaffolding Manhours in Heavy Industrial Construction Projects: An application of machine learning

    In cold countries like Canada, modular construction is widely adopted in heavy industrial construction projects due to weather uncertainties. Temporary structures, especially scaffolding, are essential to these construction processes, since scaffolding provides workers with easy access to carry out construction activities at different heights and also ensures the safety of labourers. As an indirect cost, scaffolding is estimated at 15-40% of project costs. Furthermore, as project size increases, scaffolding consumes more resources than estimated, which may cause budget overruns and schedule delays. However, due to the lack of systematic and scientific models for estimating scaffolding productivity, heavy industrial companies have difficulty planning and allocating resources for scaffold activities before construction. To overcome these challenges, this paper proposes a predictive model to estimate scaffolding productivity based on the historical scaffolding data of a heavy industrial project. The proposed model is developed in the following steps: (i) identifying the key parameters (e.g. specific trades, work type, different scaffold methods, task times spent using scaffolds, and weights of the scaffolds) that influence scaffolding manhours and project productivity; and (ii) developing predictive models for scaffold manhours using machine learning algorithms including multiple linear regression, decision tree regression, random forest regression, and artificial neural networks (ANN). The accuracy of the models has been measured with evaluation metrics: mean absolute error (MAE), root mean squared error (RMSE), and the R-squared value. The findings reveal up to 90% accuracy for the ANN models.
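
    Step (ii) can be sketched with scikit-learn as below, comparing the four named regressors under the three named metrics; the feature encoding and data are placeholders for the project's historical scaffolding records.

        # Sketch of step (ii) on synthetic stand-ins for the scaffolding records.
        import numpy as np
        from sklearn.model_selection import train_test_split
        from sklearn.linear_model import LinearRegression
        from sklearn.tree import DecisionTreeRegressor
        from sklearn.ensemble import RandomForestRegressor
        from sklearn.neural_network import MLPRegressor
        from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

        rng = np.random.default_rng(1)
        # Columns stand in for encoded trade, work type, scaffold method, task time, weight.
        X = rng.uniform(size=(500, 5))
        y = 40 * X[:, 3] + 25 * X[:, 4] + 10 * X[:, 0] + rng.normal(0, 2, 500)
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=1)

        models = {
            "MLR":           LinearRegression(),
            "Decision tree": DecisionTreeRegressor(random_state=1),
            "Random forest": RandomForestRegressor(random_state=1),
            "ANN":           MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=2000,
                                          random_state=1),
        }
        for name, model in models.items():
            pred = model.fit(X_tr, y_tr).predict(X_te)
            rmse = mean_squared_error(y_te, pred) ** 0.5
            print(f"{name:14s} MAE={mean_absolute_error(y_te, pred):.2f} "
                  f"RMSE={rmse:.2f} R2={r2_score(y_te, pred):.3f}")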

    OpenML: networked science in machine learning

    Many sciences have made significant breakthroughs by adopting online tools that help organize, structure, and mine information that is too detailed to be printed in journals. In this paper, we introduce OpenML, a place for machine learning researchers to share and organize data in fine detail, so that they can work more effectively, be more visible, and collaborate with others to tackle harder problems. We discuss how OpenML relates to other examples of networked science and what benefits it brings for machine learning research, individual scientists, as well as students and practitioners.
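
    OpenML's collections are also accessible programmatically; below is a minimal sketch using the official openml Python client, where the dataset ID is an arbitrary illustrative choice.

        # Minimal sketch using the official openml client (pip install openml);
        # dataset ID 61 (the classic iris set) is an arbitrary illustrative choice.
        import openml

        dataset = openml.datasets.get_dataset(61)  # fetch metadata and data
        X, y, _, attribute_names = dataset.get_data(
            target=dataset.default_target_attribute  # split off the target column
        )
        print(dataset.name, X.shape, attribute_names[:3])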