554 research outputs found

    A Bibliometric Survey on the Reliable Software Delivery Using Predictive Analysis

    Delivering a reliable software product is a fairly complex process that requires proper coordination among the various teams in planning, execution, and testing. Much of the development time and software budget is spent finding and fixing bugs. Rework and side-effect costs, caused by bugs inherent in modified code, are mostly not visible in planned estimates; they affect the software delivery timeline and increase cost. Advances in artificial intelligence make it possible to predict probable defects through classification based on software code changes, helping the software development team make rational decisions. Optimizing software cost and improving software quality are top priorities for the industry to remain profitable in a competitive market. Hence, there is a strong urge to improve software delivery quality by minimizing defects and keeping reasonable control over predicted defects. This paper presents a bibliometric study of reliable software delivery using predictive analysis, based on 450 documents selected from the Scopus database using keywords such as software defect prediction, machine learning, and artificial intelligence. The study covers the years 2010 to 2021. The survey shows that software defect prediction has received strong attention from researchers, and that there are great possibilities to predict and improve overall software product quality using artificial intelligence techniques.
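
    As a rough illustration of the selection step described above, the sketch below filters a hypothetical Scopus CSV export by the survey's keywords and its 2010-2021 window; the column names (`Title`, `Abstract`, `Year`) and file name are assumptions, not taken from the paper.

```python
# Minimal sketch of keyword/year filtering for a bibliometric selection.
# Assumes a hypothetical CSV export with "Title", "Abstract", "Year" columns.
import pandas as pd

KEYWORDS = ["software defect prediction", "machine learning", "artificial intelligence"]

def filter_records(csv_path: str) -> pd.DataFrame:
    df = pd.read_csv(csv_path)
    # Keep documents published in the surveyed window (2010-2021).
    df = df[(df["Year"] >= 2010) & (df["Year"] <= 2021)]
    # Keep documents whose title or abstract mentions any survey keyword.
    text = (df["Title"].fillna("") + " " + df["Abstract"].fillna("")).str.lower()
    mask = text.apply(lambda t: any(k in t for k in KEYWORDS))
    return df[mask]

if __name__ == "__main__":
    selected = filter_records("scopus_export.csv")   # hypothetical export file
    print(f"{len(selected)} documents selected")
```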

    Deep Incremental Learning of Imbalanced Data for Just-In-Time Software Defect Prediction

    This work stems from three observations on prior Just-In-Time Software Defect Prediction (JIT-SDP) models. First, prior studies treat the JIT-SDP problem solely as a classification problem. Second, prior JIT-SDP studies do not consider that class-balancing preprocessing may change the underlying characteristics of software changeset data. Third, prior JIT-SDP incremental learning models address only a single source of concept drift, the evolution of class imbalance. We propose an incremental learning framework for JIT-SDP called CPI-JIT. First, in addition to a classification modeling component, the framework includes a time-series forecasting component that learns the temporal interdependence among changesets. Second, the framework features a purposefully designed over-sampling technique based on SMOTE and principal curves, called SMOTE-PC, which preserves the underlying distribution of software changeset data. Within this framework we propose an incremental deep neural network model called DeepICP. Through an evaluation on a set of software projects, we show that: 1) SMOTE-PC improves the model's predictive performance; 2) for some software projects it is beneficial for defect prediction to harness the temporal interdependence of software changesets; and 3) principal curves summarize the underlying distribution of changeset data and reveal a new source of concept drift that the DeepICP model is designed to adapt to.
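
    The framework's SMOTE-PC balancer is specific to the paper; as a point of reference, the hedged sketch below applies plain SMOTE from imbalanced-learn to toy changeset features, which is the baseline that SMOTE-PC refines with principal curves. The feature names and class ratio are illustrative only.

```python
# Sketch of plain SMOTE over-sampling on imbalanced changeset data (baseline);
# the paper's SMOTE-PC variant (SMOTE + principal curves) is not shown here.
import numpy as np
from imblearn.over_sampling import SMOTE

rng = np.random.default_rng(0)
# Toy changeset features, e.g. lines added, lines deleted, files touched.
X = rng.normal(size=(500, 3))
y = np.r_[np.ones(50, dtype=int), np.zeros(450, dtype=int)]  # ~10% defect-inducing

X_bal, y_bal = SMOTE(random_state=0).fit_resample(X, y)
print(np.bincount(y), "->", np.bincount(y_bal))  # class counts before and after
```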

    Reliability in open source software

    Open Source Software (OSS) is a component or an application whose source code is freely accessible and changeable by users, subject to constraints expressed in a number of licensing modes. It implies a global alliance for developing quality software with quick bug fixing along with quick evolution of software features. In recent years the tendency toward adopting OSS in industrial projects has increased swiftly. Many commercial products use OSS in various fields such as embedded systems, web management systems, and mobile software. In addition, many OSS components are modified and adopted in software products. According to the Netcraft survey, more than 58% of web servers use an open source web server, Apache. The swift increase in the adoption of open source technology is due to its availability and affordability. Recent empirical research published by Forrester highlighted that although many European software companies have a clear OSS adoption strategy, there are fears and questions about the adoption. All of these fears and concerns can be traced back to the quality and reliability of OSS. Reliability is one of the more important characteristics of software quality when considered for commercial use. It is defined as the probability of failure-free operation of software for a specified period of time in a specified environment (IEEE Std. 1633-2008). While open source projects routinely provide information about community activity, the number of developers, and the number of users or downloads, this is not enough to convey information about reliability. Software reliability growth models (SRGM) are frequently used in the literature for the characterization of reliability in industrial software. These models assume that reliability grows after a defect has been detected and fixed. SRGM are a prominent class of software reliability models (SRM). An SRM is a mathematical expression that specifies the general form of the software failure process as a function of factors such as fault introduction, fault removal, and the operational environment. Due to defect identification and removal, the failure rate (failures per unit of time) of a software system generally decreases over time. Software reliability modeling estimates the form of the failure-rate curve by statistically estimating the parameters associated with the selected model. The purpose of this measure is twofold: 1) to estimate the extra test time required to meet a specified reliability objective and 2) to identify the expected reliability of the software after release (IEEE Std. 1633-2008). SRGM can be applied to guide the test board in deciding whether to stop or continue testing. These models are grouped into concave and S-shaped models on the basis of their assumptions about the cumulative failure occurrence pattern. The S-shaped models assume that the cumulative number of failures follows an S-shaped pattern: initially the testers are not familiar with the product, then they become more familiar, and hence there is a slow increase in fault removal; as the testers' skills improve, the rate of uncovering defects increases quickly and then levels off as the residual errors become more difficult to remove. In the concave models the failure intensity reaches a peak and then decreases, so concave models indicate that the failure intensity is expected to decrease exponentially after the peak is reached.
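
    As a minimal illustration of the two model shapes discussed above, the sketch below fits a concave (Goel-Okumoto) and an S-shaped (Delayed S-shaped) mean value function to synthetic cumulative failure counts with non-linear least squares; the data and starting values are illustrative, not taken from the thesis.

```python
# Fit two representative SRGM mean value functions to toy failure data.
import numpy as np
from scipy.optimize import curve_fit

def goel_okumoto(t, a, b):        # concave: mu(t) = a * (1 - exp(-b t))
    return a * (1.0 - np.exp(-b * t))

def delayed_s_shaped(t, a, b):    # S-shaped: mu(t) = a * (1 - (1 + b t) exp(-b t))
    return a * (1.0 - (1.0 + b * t) * np.exp(-b * t))

t = np.arange(1, 21, dtype=float)        # weeks of testing (synthetic)
failures = 100 * (1 - np.exp(-0.15 * t)) + np.random.default_rng(1).normal(0, 2, t.size)

for name, model in [("Goel-Okumoto", goel_okumoto), ("Delayed S-shaped", delayed_s_shaped)]:
    params, _ = curve_fit(model, t, failures, p0=(100.0, 0.1), maxfev=10000)
    print(name, "a=%.1f b=%.3f" % tuple(params))
```
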
From an exhaustive study of the literature I identified three research gaps. SRGM have been widely used for the reliability characterization of closed source software (CSS), but 1) there is no universally applicable model that can be applied in all cases, 2) the applicability of SRGM to OSS is unclear, and 3) there is no agreement on how to select the best model among several alternatives, and no specific empirical methodologies have been proposed, especially for OSS. My PhD work focuses mainly on these three research gaps. In the first step, addressing the first gap, I comparatively analyzed eight SRGM, including Musa-Okumoto, Inflection S-shaped, Goel-Okumoto, Delayed S-shaped, Logistic, Gompertz, and Generalized Goel, in terms of their fitting and prediction capabilities. These models were selected because of their widespread use and because they are the most representative in their category. For this study, 38 failure datasets from 38 projects were used; 6 projects were OSS and 32 were CSS. Of the 32 CSS datasets, 22 were from the testing phase and the remaining 10 were from the operational phase (i.e., the field). The outcomes show that Musa-Okumoto remains the best model for CSS projects, while Inflection S-shaped and Gompertz remain the best for OSS projects. We also observe that concave models perform better for CSS projects and S-shaped models perform better for OSS projects. In the second step, addressing the second gap, the reliability growth of OSS projects was compared with that of CSS projects. For this purpose, 25 OSS and 22 CSS projects were selected along with their defect data. Eight SRGM were fitted to the defect data of the selected projects and reliability growth was analyzed with respect to the fitted models. I found that all of the selected models fitted the defect data of OSS projects in the same manner as that of CSS projects, confirming that the reliability of OSS projects grows similarly to that of CSS projects. However, I observed that S-shaped models perform better for OSS and concave models perform better for CSS. To address the third research gap, I proposed a method that selects the best SRGM among several alternative models for predicting the residual defects of an OSS. The method helps practitioners decide whether or not to adopt an OSS component in a project. We tested the method empirically by applying it to twenty-one releases of seven OSS projects. The validation results show that the method selects the best model in 17 cases out of 21; in the remaining four it selects the second-best model.
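
    The thesis's selection method is not reproduced here; the sketch below shows one generic way to rank candidate SRGM, by fitting each model on the early part of the failure history and scoring it on the held-out tail. The `models` argument would hold tuples like the functions sketched above; this is an assumed illustration, not the proposed method.

```python
# Rank candidate SRGM by out-of-sample error on the tail of the failure history.
# `models` is a list of (name, mean_value_function, initial_params) tuples.
import numpy as np
from scipy.optimize import curve_fit

def rank_models(models, t, cum_failures, split=0.7):
    cut = int(len(t) * split)
    scores = {}
    for name, f, p0 in models:
        try:
            params, _ = curve_fit(f, t[:cut], cum_failures[:cut], p0=p0, maxfev=10000)
            pred = f(t[cut:], *params)
            scores[name] = float(np.sqrt(np.mean((pred - cum_failures[cut:]) ** 2)))
        except RuntimeError:          # the fit failed to converge
            scores[name] = float("inf")
    return sorted(scores.items(), key=lambda kv: kv[1])   # lowest RMSE first
```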

    Security assessment of open source third-parties applications

    Free and Open Source Software (FOSS) components are ubiquitous in both proprietary and open source applications. In this dissertation we discuss the challenges that large software vendors face when they must integrate and maintain FOSS components in their software supply chain. Each time a vulnerability is disclosed in a FOSS component, a software vendor must decide whether to update the component, patch the application itself, or do nothing because the vulnerability does not apply to the deployed version, which may be old enough not to be vulnerable. This is particularly challenging for enterprise software vendors that consume thousands of FOSS components and offer more than a decade of support and security fixes for applications that include these components. First, we design a framework for performing security vulnerability experiments, in particular for testing known exploits for publicly disclosed vulnerabilities against different versions and software configurations. Second, we provide an automatic screening test for quickly identifying the versions of a FOSS component likely affected by a newly disclosed vulnerability: a novel method that scans the entire repository of a FOSS component in a matter of minutes. We show that our screening test scales to large open source projects. Finally, to facilitate the global security maintenance of a large portfolio of FOSS components, we discuss various characteristics of FOSS components and their potential impact on the security maintenance effort, and empirically identify the key drivers.
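
    The screening test itself is the dissertation's contribution and is not reproduced here; the sketch below only illustrates the coarse idea of checking which released versions of a component still contain a code fragment that a security fix removed, using plain git commands against a hypothetical local clone.

```python
# Coarse version-screening sketch: list tags whose tree still contains a given
# code fragment (e.g. one removed by the security fix). Illustration only.
import subprocess

def tags(repo: str):
    out = subprocess.run(["git", "-C", repo, "tag"],
                         capture_output=True, text=True, check=True)
    return out.stdout.split()

def contains_fragment(repo: str, tag: str, fragment: str) -> bool:
    # `git grep -q` exits with 0 when the fixed-string pattern is found in the tag's tree.
    res = subprocess.run(["git", "-C", repo, "grep", "-q", "-F", fragment, tag],
                         capture_output=True)
    return res.returncode == 0

def likely_affected(repo: str, fragment: str):
    return [t for t in tags(repo) if contains_fragment(repo, t, fragment)]
```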

    Improving Defect Prediction Models by Combining Classifiers Predicting Different Defects

    Background: The software industry spends a lot of money on finding and fixing defects. It utilises software defect prediction models to identify code that is likely to be defective. Prediction models have, however, reached a performance bottleneck. Any improvement to prediction models would likely yield fewer defects and thus reduce costs for companies. Aim: In this dissertation I demonstrate that different families of classifiers find distinct subsets of defects, and I show how this finding can be used to design ensemble models which outperform other state-of-the-art software defect prediction models. Method: This dissertation is supported by published work. In the first paper I explore the quality of data, which is a prerequisite for building reliable software defect prediction models. The second and third papers explore the ability of different software defect prediction models to find distinct subsets of defects. The fourth paper explores how software defect prediction models can be improved by combining classifiers that predict different defective components into ensembles. An additional, unpublished piece of work presents a visual technique for the analysis of predictions made by individual classifiers and discusses some possible constraints for classifiers used in software defect prediction. Results: Software defect prediction models created by classifiers of different families predict distinct subsets of defects. Ensembles composed of classifiers belonging to different families outperform other ensemble and standalone models. Only a few highly diverse and accurate base models are needed to compose an effective ensemble, and such an ensemble consistently gains more correctly predicted defects than it adds incorrect predictions. Conclusion: Ensembles should not use majority-voting techniques to combine the decisions of classifiers in software defect prediction, as this discards correct predictions from classifiers that uniquely identify defects. Some classifiers may be less successful for software defect prediction because of the complex decision boundaries of defect data. Stacking-based ensembles can outperform other ensemble and standalone techniques. I propose possible new avenues of research that could further improve the modelling of ensembles in software defect prediction. Data quality should be explicitly considered prior to experiments for researchers to establish reliable results.
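
    The sketch below shows the general shape of such a stacking ensemble built from classifiers of different families, using scikit-learn on synthetic data; the base learners, dataset, and scoring are illustrative choices, not the dissertation's configuration.

```python
# Stacking ensemble of base learners from different classifier families.
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

# Imbalanced toy data standing in for module-level defect features.
X, y = make_classification(n_samples=1000, n_features=20, weights=[0.8, 0.2], random_state=0)

base_learners = [                       # a few diverse families
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("nb", GaussianNB()),
    ("knn", KNeighborsClassifier()),
]
stack = StackingClassifier(estimators=base_learners, final_estimator=LogisticRegression())
print("CV F1:", cross_val_score(stack, X, y, cv=5, scoring="f1").mean())
```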

    Software Engineering 2021: conference held 22-26 February 2021, Braunschweig/virtual


    Automatically Identifying Code Features for Software Defect Prediction: Using AST N-grams

    Context: Identifying defects in code early is important. A wide range of static code metrics have been evaluated as potential defect indicators. Most of these metrics offer only high-level insights and focus on particular pre-selected features of the code. None of the currently used metrics clearly performs best in defect prediction. Objective: We use Abstract Syntax Tree (AST) n-grams to identify features of defective Java code that improve defect prediction performance. Method: Our approach is bottom-up and does not rely on pre-selecting any specific features of the code. We use non-parametric testing to determine relationships between AST n-grams and faults in both open source and commercial systems. We build defect prediction models using three machine learning techniques. Results: We show that AST n-grams are very significantly related to faults in some systems, with very large effect sizes. The presence of certain frequently occurring AST n-grams in a method can make that method up to three times more likely to contain a fault. AST n-grams can have a large effect on the performance of defect prediction models. Conclusions: We suggest that AST n-grams offer developers a promising approach to identifying potentially defective code.
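
    The study extracts n-grams from Java ASTs; the sketch below only illustrates the underlying idea on Python source using the standard `ast` module, counting n-grams over the flattened sequence of node types.

```python
# Count AST node-type n-grams for a snippet of source code (Python analogue).
import ast
from collections import Counter

def node_type_sequence(source: str):
    """Flattened sequence of AST node type names, in ast.walk order."""
    return [type(node).__name__ for node in ast.walk(ast.parse(source))]

def ngrams(seq, n=3):
    return Counter(tuple(seq[i:i + n]) for i in range(len(seq) - n + 1))

code = "def f(xs):\n    return [x * x for x in xs if x > 0]\n"
for gram, count in ngrams(node_type_sequence(code)).most_common(5):
    print(count, gram)
```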

    Towards Understanding Fairness and its Composition in Ensemble Machine Learning

    Machine Learning (ML) software has been widely adopted in modern society, with reported fairness implications for minority groups based on race, sex, age, etc. Many recent works have proposed methods to measure and mitigate algorithmic bias in ML models. The existing approaches focus on single-classifier ML models, but real-world ML models are often composed of multiple independent or dependent learners in an ensemble (e.g., Random Forest), where fairness composes in a non-trivial way. How does fairness compose in ensembles? What are the fairness impacts of the learners on the ultimate fairness of the ensemble? Can fair learners result in an unfair ensemble? Furthermore, studies have shown that hyperparameters influence the fairness of ML models. Ensemble hyperparameters are more complex, since they affect how learners are combined in different categories of ensembles. Understanding the impact of ensemble hyperparameters on fairness will help programmers design fair ensembles. Today, we do not fully understand these issues for different ensemble algorithms. In this paper, we comprehensively study popular real-world ensembles: bagging, boosting, stacking, and voting. We have developed a benchmark of 168 ensemble models collected from Kaggle on four popular fairness datasets. We use existing fairness metrics to understand the composition of fairness. Our results show that ensembles can be designed to be fairer without using mitigation techniques. We also identify the interplay between fairness composition and data characteristics to guide fair ensemble design. Finally, our benchmark can be leveraged for further research on fair ensembles. To the best of our knowledge, this is one of the first and largest studies on fairness composition in ensembles presented in the literature.
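
    As a toy illustration of measuring one fairness notion for an ensemble, the sketch below trains a scikit-learn voting ensemble on synthetic data with a synthetic protected attribute and reports the demographic parity difference; it is not the paper's benchmark or metric suite.

```python
# Demographic parity difference of a hard-voting ensemble on synthetic data.
import numpy as np
from sklearn.ensemble import VotingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n = 2000
sensitive = rng.integers(0, 2, n)               # synthetic binarised protected attribute
X = np.c_[rng.normal(size=(n, 4)), sensitive]   # the attribute leaks into the features
y = (X[:, 0] + 0.5 * sensitive + rng.normal(scale=0.5, size=n) > 0.5).astype(int)

ensemble = VotingClassifier([
    ("lr", LogisticRegression(max_iter=1000)),
    ("dt", DecisionTreeClassifier(max_depth=5)),
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
], voting="hard").fit(X, y)

pred = ensemble.predict(X)
# Gap in positive prediction rates between the two groups.
dpd = abs(pred[sensitive == 1].mean() - pred[sensitive == 0].mean())
print(f"demographic parity difference: {dpd:.3f}")
```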