    A Bibliometric Survey on the Reliable Software Delivery Using Predictive Analysis

    Delivering a reliable software product is a fairly complex process that requires coordination among various teams in planning, execution, and testing. Most development time and a large share of the software budget's cost are spent finding and fixing bugs. Rework and side-effect costs, caused by inherent bugs in modified code, are mostly invisible in planned estimates, yet they impact the software delivery timeline and increase cost. Advances in artificial intelligence can predict probable defects via classification based on software code changes, helping the software development team make rational decisions. Optimizing software cost and improving software quality are top industry priorities for remaining profitable in a competitive market. Hence, there is a strong need to improve software delivery quality by minimizing defects and keeping predicted defects under reasonable control. This paper presents a bibliometric study of reliable software delivery using predictive analysis, based on 450 documents selected from the Scopus database with keywords such as software defect prediction, machine learning, and artificial intelligence. The study covers the years 2010 to 2021. The survey shows that software defect prediction has received substantial attention from researchers, and that there are strong opportunities to predict and improve overall software product quality using artificial intelligence techniques.
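    A minimal sketch of the kind of change-based defect classifier the surveyed work applies, assuming synthetic per-commit metrics (lines added/deleted, files touched, author experience) and a random forest; none of these feature names or numbers come from the paper.

        # Hedged illustration: train a classifier on hypothetical code-change
        # metrics to flag defect-prone commits.
        import numpy as np
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.metrics import classification_report
        from sklearn.model_selection import train_test_split

        rng = np.random.default_rng(0)
        n = 1000
        X = np.column_stack([
            rng.poisson(40, n),     # lines added
            rng.poisson(15, n),     # lines deleted
            rng.poisson(3, n),      # files touched
            rng.uniform(0, 10, n),  # author experience, years (assumed feature)
        ])
        # Synthetic label: larger, wider-reaching changes are more defect-prone.
        risk = 0.02 * X[:, 0] + 0.03 * X[:, 1] + 0.3 * X[:, 2] - 0.2 * X[:, 3]
        y = (risk + rng.normal(0, 1, n) > np.median(risk)).astype(int)

        X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
        clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
        print(classification_report(y_te, clf.predict(X_te)))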

    Feature reduction and selection for use in machine learning for manufacturing

    In a complex manufacturing system such as a multistage manufacturing system, maintaining product quality is a challenging task because of the interconnectivity and dependency of the factors that can affect the final product. With the increasing availability of data, Machine Learning (ML) approaches are being applied to assess and predict quality-related issues. In this paper, several ML algorithms, combined with feature reduction/selection methods, were applied to a publicly available multistage manufacturing dataset to predict the output measurements (in mm). A total of 24 prediction models were produced and evaluated on prediction accuracy and execution time. The results show that uncontrolled variables are the features most commonly retained by the selection/reduction methods, suggesting a strong relationship to product quality, and that the performance of the prediction models depended heavily on the choice of ML algorithm.
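    A hedged sketch of the select-then-predict pipeline the paper evaluates, on synthetic stand-ins for multistage process variables; the data, the choice of SelectKBest with 10 retained features, and the random forest regressor are illustrative assumptions, not the paper's exact setup.

        # Keep the k strongest predictors, then regress the output measurement.
        import numpy as np
        from sklearn.ensemble import RandomForestRegressor
        from sklearn.feature_selection import SelectKBest, f_regression
        from sklearn.model_selection import cross_val_score
        from sklearn.pipeline import make_pipeline

        rng = np.random.default_rng(0)
        n_samples, n_features = 500, 40
        X = rng.normal(size=(n_samples, n_features))  # process variables per stage
        # Output measurement (mm): only the first 5 variables actually matter.
        y = X[:, :5] @ rng.normal(size=5) + rng.normal(0, 0.1, n_samples)

        pipe = make_pipeline(
            SelectKBest(f_regression, k=10),
            RandomForestRegressor(n_estimators=100, random_state=0),
        )
        print("mean R^2 over 5 folds:",
              cross_val_score(pipe, X, y, cv=5, scoring="r2").mean().round(3))

        pipe.fit(X, y)  # inspect which variables the selector kept
        print("selected columns:", np.flatnonzero(pipe[0].get_support()))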

    Statistical Analysis for Revealing Defects in Software Projects

    Dissertation presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Information Systems and Technologies Management. Defect detection in software is the procedure of identifying parts of software that may contain defects. Software companies always seek to improve the performance of software projects in terms of quality and efficiency, and to deliver projects to their communities defect-free and on time. Early detection of defects in software projects also helps avoid project failure and saves costs, team effort, and time. These companies therefore need to build an intelligent model capable of detecting software defects accurately and efficiently. This study pursues two main objectives. The first is to build a statistical model that identifies the critical defect factors influencing software projects. The second is to build a statistical model that reveals defects early in software projects reasonably accurately. A bibliometric map (VOSviewer) was used to find the relationships between the common terms in these domains. The results of this study are divided into three parts. First, the term "software engineering" is connected to "cluster," "regression," and "neural network," while the terms "random forest" and "feature selection" are connected to "neural network," "recall," "software engineering," "cluster," "regression," "fault prediction model," "software defect prediction," and "defect density." Second, 29 manuscripts were checked and analyzed in detail, their major contributions summarized, and a few research gaps identified. Third, software companies try to find the critical factors that affect the detection of software defects and any intelligent or statistical method that helps to build a model capable of detecting those defects with high accuracy. Two statistical models, multiple linear regression (MLR) and logistic regression (LR), were used to find the critical factors and, through them, to detect software defects accurately. MLR is executed using two methods: critical defect factors (CDF) and a premier list of software defect factors (PLSDF). The accuracy of MLR-CDF and MLR-PLSDF is 82.3% and 79.9%, respectively, with standard errors of 26% and 28%. In addition, LR is executed using the same two methods, CDF and PLSDF. The accuracy of LR-CDF and LR-PLSDF is 86.4% and 83.8%, respectively, with standard errors of 22% and 25%. LR-CDF therefore outperforms all the proposed models and state-of-the-art methods in terms of accuracy and standard error.
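    A minimal sketch of the two statistical models the thesis compares: MLR to rank candidate defect factors by significance, and LR to classify defective modules and report accuracy. The factor names and synthetic data are assumptions for illustration, not the thesis's dataset or its CDF/PLSDF factor lists.

        import numpy as np
        import pandas as pd
        import statsmodels.api as sm

        rng = np.random.default_rng(0)
        n = 400
        df = pd.DataFrame({
            "complexity": rng.gamma(2.0, 2.0, n),  # hypothetical defect factors
            "churn":      rng.poisson(20, n),
            "team_size":  rng.integers(2, 12, n),
        })
        true_logit = (0.4 * df["complexity"] + 0.05 * df["churn"]
                      - 0.1 * df["team_size"] - 1.0)
        df["defective"] = (rng.uniform(size=n) < 1 / (1 + np.exp(-true_logit))).astype(int)

        X = sm.add_constant(df[["complexity", "churn", "team_size"]])

        # MLR: low p-values flag the "critical" factors.
        mlr = sm.OLS(df["defective"], X).fit()
        print(mlr.pvalues.round(4))

        # LR: classify, then report accuracy as in the thesis's evaluation.
        lr = sm.Logit(df["defective"], X).fit(disp=0)
        acc = ((lr.predict(X) > 0.5).astype(int) == df["defective"]).mean()
        print(f"LR accuracy: {acc:.1%}")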

    Learning Very Large Configuration Spaces: What Matters for Linux Kernel Sizes

    Linux kernels are used in a wide variety of appliances, many of which have strong requirements on kernel size due to constraints such as limited memory or instant boot. With more than ten thousand configuration options to choose from, obtaining a suitable trade-off between kernel size and functionality is an extremely hard problem. Developers, contributors, and users spend significant effort documenting, understanding, and eventually tuning (combinations of) options to meet a kernel size target. In this paper, we investigate how machine learning can help explain what matters for predicting a given Linux kernel's size. Unveiling what matters in such a very large configuration space is challenging for two reasons: (1) however much time we spend, we can only build and measure a tiny fraction of the possible kernel configurations; (2) the prediction model should be both accurate and interpretable. We compare different machine learning algorithms and demonstrate the benefits of specific feature encoding and selection methods for learning an accurate model that is fast to compute and simple to interpret. Our results are validated over 95,854 kernel configurations and show that we can achieve low prediction errors over a reduced set of options. We also show that we can extract interpretable information for refining documentation and experts' knowledge of Linux, or even for assigning more sensible default values to options.
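    A hedged sketch of the overall approach under stated assumptions: encode y/n configuration options as binary features, fit a tree-based regressor on measured kernel sizes, and rank options by importance for interpretation. The option count, the gradient-boosting model, and the synthetic sizes are illustrative, not the paper's exact setup.

        import numpy as np
        from sklearn.ensemble import GradientBoostingRegressor
        from sklearn.metrics import mean_absolute_percentage_error
        from sklearn.model_selection import train_test_split

        rng = np.random.default_rng(0)
        n_configs, n_options = 2000, 300
        X = rng.integers(0, 2, size=(n_configs, n_options))  # binary y/n options

        # Synthetic kernel sizes: a handful of options dominate the size.
        weights = np.zeros(n_options)
        weights[:10] = rng.uniform(0.5, 5.0, 10)  # influential options (MB each)
        y = 10.0 + X @ weights + rng.normal(0, 0.2, n_configs)

        X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
        model = GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr)
        print(f"MAPE: {mean_absolute_percentage_error(y_te, model.predict(X_te)):.2%}")

        # Interpretability: which options matter most for the predicted size?
        top = np.argsort(model.feature_importances_)[::-1][:10]
        print("most influential option indices:", top)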

    A Technique for Evaluating the Health Status of a Software Module Using Process Metrics

    Identifying error-prone files in large software systems is an area that has received significant attention over the years. In this thesis, we propose a process-metrics-based method for predicting the health status of a file from its commit profile in its GitHub repository. Specifically, for each file and each bug-fixing commit in which the file participates, we compute a dependency score between the committed file and its co-committed files. The computed score is appropriately decayed if the file does not participate in new bug-fixing commits in the system. By examining the trend of each file's dependency score over time, we try to deduce whether the file will participate in a bug-fixing commit among the immediately following commits. The approach has been evaluated on 21 medium-to-large open-source systems by correlating the dependency-metric trend against the known outcome obtained from a dataset used as a gold standard.
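    An illustrative sketch of the scoring idea: each bug-fixing commit raises a file's dependency score in proportion to its co-committed files, and the score decays in bug-fixing commits the file misses. The additive update and the 0.9 decay factor are assumptions; the thesis defines its own scoring.

        from collections import defaultdict

        DECAY = 0.9  # hypothetical per-commit decay factor

        def dependency_trend(bug_fix_commits):
            """bug_fix_commits: chronologically ordered list of sets of file
            paths changed in each bug-fixing commit. Returns each file's score
            trajectory, whose trend would be examined for health status."""
            scores = defaultdict(float)
            history = defaultdict(list)
            for files in bug_fix_commits:
                for f in list(scores):
                    if f not in files:
                        scores[f] *= DECAY       # decay files absent from this fix
                for f in files:
                    scores[f] += len(files) - 1  # credit co-change with other files
                for f in scores:
                    history[f].append(scores[f])
            return history

        commits = [{"a.c", "b.c"}, {"a.c", "c.c", "d.c"}, {"b.c"}]
        for f, trend in sorted(dependency_trend(commits).items()):
            print(f, [round(s, 2) for s in trend])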