
    Evidence-based defect assessment and prediction for software product lines

    The systematic reuse provided by software product lines provides opportunities to achieve increased quality and reliability as a product line matures. This has led to a widely accepted assumption that as a product line evolves, its reliability improves. However, evidence in terms of empirical investigation of the relationship among change, reuse and reliability in evolving software product lines is lacking. To address this problem, this work investigates: 1) whether reliability as measured by post-deployment failures improves as the products and components in a software product line change over time, and 2) whether the stabilizing effect of shared artifacts enables accurate prediction of failure-prone files in the product line. The first part of this work performs defect assessment and investigates defect trends in Eclipse, an open-source software product line. It analyzes the evolution of the product line over time in terms of the total number of defects, the percentage of severe defects and the relationship between defects and changes. The second part of this work explores prediction of failure-prone files in the Eclipse product line to determine whether prediction improves as the product line evolves over time. In addition, this part investigates the effect of defect and data collection periods on the prediction performance. The main contributions of this work include findings that the majority of files with severe defects are reused files rather than new files, but that common components experience less change than variation components. The work also found that there is a consistent set of metrics which serve as prominent predictors across multiple products and reuse categories over time. Classification of post-release, failure-prone files using change data for the Eclipse product line yields better recall and false positive rates than classification using static code metrics.
The work also found that ongoing change in product lines hinders the ability to predict failure-prone files, and that predicting post-release defects using pre-release change data for the Eclipse case study is difficult. For example, using more data from the past to predict future failure-prone files does not necessarily give better results than using data only from the recent past. The empirical investigation of product line change and defect data leads to an improved understanding of the interplay among change, reuse and reliability as a product line evolves.
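The evaluation described above, classifying files as failure-prone and scoring the result by recall and false positive rate, can be sketched as follows. The file names, revision counts, and threshold below are invented for illustration and are not data from the Eclipse study:

```python
# Hypothetical sketch: flagging post-release failure-prone files from
# pre-release change data, then scoring the classification with recall
# and false positive rate. All values here are illustrative assumptions.

# (file, pre-release revision count, had a post-release failure?)
files = [
    ("ui/Editor.java",    14, True),
    ("core/Parser.java",   9, True),
    ("ui/Dialog.java",     2, False),
    ("util/Strings.java",  1, False),
    ("core/Lexer.java",    7, False),
]

def evaluate(threshold):
    """Predict 'failure-prone' when a file changed more than `threshold` times."""
    tp = fp = fn = tn = 0
    for _, revisions, failed in files:
        predicted = revisions > threshold
        if predicted and failed:
            tp += 1
        elif predicted and not failed:
            fp += 1
        elif not predicted and failed:
            fn += 1
        else:
            tn += 1
    recall = tp / (tp + fn)  # fraction of failing files caught
    fpr = fp / (fp + tn)     # fraction of clean files wrongly flagged
    return recall, fpr

recall, fpr = evaluate(5)
print(f"recall={recall:.2f}, false positive rate={fpr:.2f}")
# → recall=1.00, false positive rate=0.33
```

A change-based predictor like this trades recall against false positives by moving the threshold, which is the same recall/false-positive trade-off the study uses to compare change data against static code metrics.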

    Software reuse cuts both ways: An empirical analysis of its relationship with security vulnerabilities

    Software reuse is a widely adopted practice among both researchers and practitioners. The relation between security and reuse can go both ways: a system can become more secure by relying on mature dependencies, or more insecure by exposing a larger attack surface via exploitable dependencies. To follow up on a previous study and shed more light on this subject, we further examine the association between software reuse and security threats. In particular, we empirically investigate 1244 open-source projects in a multiple-case study to explore and discuss the distribution of security vulnerabilities between the code created by a development team and the code reused through dependencies. For that, we consider both potential vulnerabilities, as assessed through static analysis, and disclosed vulnerabilities, reported in public databases. The results suggest that larger projects are associated with more potential vulnerabilities in both native and reused code. Moreover, we found a strong correlation between a higher number of dependencies and a higher number of vulnerabilities. Based on our empirical investigation, it appears that source code reuse is neither a silver bullet to combat vulnerabilities nor a frightening werewolf that entails an excessive number of them.

    Software defect prediction: do different classifiers find the same defects?

    Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License CC BY 4.0 (http://creativecommons.org/licenses/by/4.0/). During the last 10 years, hundreds of different defect prediction models have been published. The performance of the classifiers used in these models is reported to be similar, with models rarely performing above the predictive performance ceiling of about 80% recall. We investigate the individual defects that four classifiers predict and analyse the level of prediction uncertainty produced by these classifiers. We perform a sensitivity analysis to compare the performance of Random Forest, Naïve Bayes, RPart and SVM classifiers when predicting defects in NASA, open source and commercial datasets. The defect predictions that each classifier makes are captured in a confusion matrix, and the prediction uncertainty of each classifier is compared. Despite similar predictive performance values for these four classifiers, each detects different sets of defects. Some classifiers are more consistent in predicting defects than others. Our results confirm that a unique subset of defects can be detected by specific classifiers. However, while some classifiers are consistent in the predictions they make, other classifiers vary in their predictions. Given our results, we conclude that classifier ensembles with decision-making strategies not based on majority voting are likely to perform best in defect prediction. Peer reviewed. Final published version.
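The paper's central observation, that classifiers with similar aggregate performance can detect different sets of defects, is easy to illustrate with synthetic data. The module names and predictions below are hypothetical, not results from the study:

```python
# Illustrative sketch (synthetic data, not the paper's experiment): two
# classifiers with identical recall can flag disjoint sets of defective
# modules, so a majority vote misses what either alone would catch.

actual_defects = {"m1", "m2", "m3", "m4"}

# Modules each (hypothetical) classifier predicts as defective.
predicted = {
    "RandomForest": {"m1", "m2", "m5"},
    "NaiveBayes":   {"m3", "m4", "m6"},
}

for name, preds in predicted.items():
    found = preds & actual_defects
    recall = len(found) / len(actual_defects)
    print(f"{name}: recall={recall:.2f}, defects found={sorted(found)}")

# Majority vote (both must agree) finds nothing; the union finds everything.
vote = predicted["RandomForest"] & predicted["NaiveBayes"]
union = predicted["RandomForest"] | predicted["NaiveBayes"]
print("majority-vote recall:", len(vote & actual_defects) / len(actual_defects))
print("union recall:", len(union & actual_defects) / len(actual_defects))
```

Both classifiers score 0.50 recall here, yet their majority vote finds no defects while their union finds all four, which is the intuition behind the paper's conclusion that ensembles should not rely on majority voting.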

    Empirical studies of open source evolution

    Copyright @ 2008 Springer-Verlag. This chapter presents a sample of empirical studies of Open Source Software (OSS) evolution. According to these studies, the classical results from the studies of proprietary software evolution, such as Lehman's laws of software evolution, might need to be revised, if not fully, at least in part, to account for the OSS observations. The book chapter also summarises what appears to be the empirical status of each of Lehman's laws with respect to OSS and highlights the threats to validity that frequently emerge in these empirical studies. The chapter also discusses related topics for further research.