139 research outputs found

    Evidence-based defect assessment and prediction for software product lines

    The systematic reuse provided by software product lines provides opportunities to achieve increased quality and reliability as a product line matures. This has led to a widely accepted assumption that as a product line evolves, its reliability improves. However, evidence in terms of empirical investigation of the relationship among change, reuse and reliability in evolving software product lines is lacking. To address the problem this work investigates: 1) whether reliability as measured by post-deployment failures improves as the products and components in a software product line change over time, and 2) whether the stabilizing effect of shared artifacts enables accurate prediction of failure-prone files in the product line. The first part of this work performs defect assessment and investigates defect trends in Eclipse, an open-source software product line. It analyzes the evolution of the product line over time in terms of the total number of defects, the percentage of severe defects and the relationship between defects and changes. The second part of this work explores prediction of failure-prone files in the Eclipse product line to determine whether prediction improves as the product line evolves over time. In addition, this part investigates the effect of defect and data collection periods on the prediction performance. The main contributions of this work include findings that the majority of files with severe defects are reused files rather than new files, but that common components experience less change than variation components. The work also found that there is a consistent set of metrics which serve as prominent predictors across multiple products and reuse categories over time. Classification of post-release, failure-prone files using change data for the Eclipse product line gives better recall and false positive rates as compared to classification using static code metrics. 
    The work also found that ongoing change in product lines hinders the ability to predict failure-prone files, and that predicting post-release defects using pre-release change data for the Eclipse case study is difficult. For example, using more data from the past to predict future failure-prone files does not necessarily give better results than using data only from the recent past. The empirical investigation of product line change and defect data leads to an improved understanding of the interplay among change, reuse and reliability as a product line evolves.

    Automatic bug triaging techniques using machine learning and stack traces

    When a software system crashes, users have the option to report the crash using automated bug tracking systems. These tools capture software crash and failure data (e.g., stack traces and memory dumps) from end-users. These data are sent in the form of bug (crash) reports to the software development teams to uncover the causes of the crash and provide adequate fixes. The reports are first assessed (usually in a semi-automatic way) by a group of software analysts, known as triagers. Triagers assign priority to the bugs and redirect them to the software development teams in order to provide fixes. The triaging process, however, is usually very challenging. The problem is that many of these reports are caused by similar faults. Studies have shown that one way to improve the bug triaging process is to automatically detect duplicate (or similar) reports. This way, triagers would not need to spend time on reports caused by faults that have already been handled. Another issue is related to the prioritization of bug reports. Triagers often rely on the information provided by the customers (the report submitters) to prioritize bug reports. However, this task can be quite tedious and requires tool support. Next, triagers route the bug report to the responsible development team based on the subsystem that caused the crash. Since having knowledge of all the subsystems of an ever-evolving industrial system is impractical, having a tool to automatically identify defective subsystems can significantly reduce the manual bug triaging effort. The main goal of this research is to investigate techniques and tools to help triagers process bug reports. We start by studying the effect of the presence of stack traces in analyzing bug reports. Next, we present a framework to help triagers in each step of the bug triaging process. We propose a new and scalable method to automatically detect duplicate bug reports using stack traces and bug report categorical features. We then propose a novel approach for predicting bug severity using stack traces and categorical features, and finally, we discuss a new method for predicting faulty product and component fields of bug reports. We evaluate the effectiveness of our techniques using bug reports from two large open-source systems. Our results show that stack traces and machine learning methods can be used to automate the bug triaging process, and hence increase the productivity of bug triagers, while reducing the costs and effort associated with manual triaging of bug reports.

    Analyzing the Influence of Processor Speed and Clock Speed on Remaining Useful Life Estimation of Software Systems

    Prognostics and Health Management (PHM) is a discipline focused on predicting the point at which systems or components will cease to perform as intended, typically measured as Remaining Useful Life (RUL). RUL serves as a vital decision-making tool for contingency planning, guiding the timing and nature of system maintenance. Historically, PHM has primarily been applied to hardware systems, with its application to software only recently explored. In a recent study we introduced a methodology and demonstrated how changes in software can impact the RUL of software. However, in practical software development, real-time performance is also influenced by various environmental attributes, including operating systems, clock speed, processor performance, RAM, machine core count and others. This research extends the analysis to assess how changes in environmental attributes, such as operating system and clock speed, affect RUL estimation in software. Findings are rigorously validated using real performance data from controlled test beds and compared with predictive model-generated data. Statistical validation, including regression analysis, supports the credibility of the results. The controlled test bed environment replicates and validates faults from real applications, ensuring a standardized assessment platform. This exploration yields actionable knowledge for software maintenance and optimization strategies, addressing a significant gap in the field of software health management

    Software Reliability models for the first stage of Software Projects

    A software reliability analysis for the first stage of software projects is presented. At this very first stage of testing we expect an increasing failure rate, to which the usual software reliability growth models based on non-homogeneous Poisson processes, such as Goel-Okumoto or Musa-Okumoto, cannot be applied. However, our analysis involves some models that combine reliability growth with increasing failure rates, such as the logistic and delayed S-shaped models. Our analysis also includes a new model based on contagion, which applies to both the increasing failure rate and the reliability growth stages. We point out that increasing failure rate stages are important to model, since corrective actions can then be taken early, and that this characteristic stands out under modern development methodologies in which development is performed simultaneously with testing, as in Agile and TDD (test-driven development). Results of applying those models to real datasets are shown. Sociedad Argentina de Informática e Investigación Operativ
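
The contrast drawn in this abstract can be illustrated with the standard textbook failure intensities of two of the named models. This is a minimal sketch and not the authors' analysis: the parameter values `a` and `b` are hypothetical, and the logistic and contagion-based models are not shown. The Goel-Okumoto intensity `a*b*exp(-b*t)` only decreases over time, while the delayed S-shaped intensity `a*b^2*t*exp(-b*t)` rises until `t = 1/b` and then falls.

```python
import math


def goel_okumoto_intensity(t, a, b):
    """Failure intensity of the Goel-Okumoto NHPP model: a*b*exp(-b*t)."""
    return a * b * math.exp(-b * t)


def delayed_s_shaped_intensity(t, a, b):
    """Failure intensity of the delayed S-shaped NHPP model: a*b^2*t*exp(-b*t)."""
    return a * b * b * t * math.exp(-b * t)


if __name__ == "__main__":
    # Hypothetical parameters, chosen only for illustration:
    # a = expected total number of failures, b = detection rate.
    a, b = 100.0, 0.5
    for t in [0.5, 1.0, 2.0, 4.0, 8.0]:
        go = goel_okumoto_intensity(t, a, b)
        ds = delayed_s_shaped_intensity(t, a, b)
        print(f"t={t:4.1f}  Goel-Okumoto={go:7.3f}  delayed S-shaped={ds:7.3f}")
```

With these assumed values the delayed S-shaped intensity peaks at `t = 1/b = 2`, so it can represent an early stage in which failures become more frequent before reliability growth sets in.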

    Data Mining and Machine Learning for Software Engineering

    Software engineering is one of the research areas where data mining is most applicable. Developers have attempted to improve software quality by mining and analyzing software data. In any phase of the software development life cycle (SDLC), a huge amount of data is produced, and design, security, or software problems may occur. In the early phases of software development, analyzing software data helps to handle these problems and leads to more accurate and timely delivery of software projects. Various data mining and machine learning studies have been conducted to deal with software engineering tasks such as defect prediction and effort estimation. This study presents the open issues, along with related solutions and recommendations, in applying data mining and machine learning techniques to software engineering.

    SZZ in the time of Pull Requests

    In the multi-commit development model, programmers complete tasks (e.g., implementing a feature) by organizing their work in several commits and packaging them into a commit-set. Analyzing data from developers using this model can be useful to tackle challenging developers' needs, such as knowing which features introduce a bug as well as assessing the risk of integrating certain features in a release. However, to do so one first needs to identify fix-inducing commit-sets. For such an identification, the SZZ algorithm is the most natural candidate, but its performance has not been evaluated in the multi-commit context yet. In this study, we conduct an in-depth investigation on the reliability and performance of SZZ in the multi-commit model. To obtain a reliable ground truth, we consider an already existing SZZ dataset and adapt it to the multi-commit context. Moreover, we devise a second dataset that is more extensive and directly created by developers as well as Quality Assurance (QA) engineers of Mozilla. Based on these datasets, we (1) test the performance of B-SZZ and its non-language-specific SZZ variations in the context of the multi-commit model, (2) investigate the reasons behind their specific behavior, and (3) analyze the impact of non-relevant commits in a commit-set and automatically detect them before using SZZ

    Learning to classify software defects from crowds: a novel approach

    In software engineering, associating each reported defect with a category allows, among many other things, for the appropriate allocation of resources. Although this classification task can be automated using standard machine learning techniques, the categorization of defects for model training requires expert knowledge, which is not always available. To circumvent this dependency, we propose to apply the learning from crowds paradigm, where training categories are obtained from multiple non-expert annotators (and so may be incomplete, noisy or erroneous) and classifiers are efficiently learnt from this subjective class information. To illustrate our proposal, we present two real applications of IBM's orthogonal defect classification to the issue tracking systems of two different real domains. Bayesian network classifiers learnt using two state-of-the-art methodologies from data labeled by a crowd of annotators are used to predict the category (impact) of reported software defects. The considered methodologies show enhanced performance compared with the straightforward solution (majority voting) according to different metrics. This shows the possibilities of using non-expert knowledge aggregation techniques when expert knowledge is unavailable.
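
The majority-voting baseline mentioned in this abstract can be sketched as follows. This is only an illustration of the baseline, not the learning-from-crowds methods the paper evaluates. The function name `majority_vote`, the defect identifiers, the example labels, and the alphabetical tie-break are all assumptions made for the example.

```python
from collections import Counter


def majority_vote(annotations):
    """Aggregate crowd labels per defect by simple majority.

    `annotations` maps a defect id to the list of labels given by the
    annotators; `None` marks an annotator who gave no label.
    Ties are broken alphabetically (an arbitrary, assumed rule).
    """
    aggregated = {}
    for defect_id, labels in annotations.items():
        votes = Counter(label for label in labels if label is not None)
        if not votes:
            aggregated[defect_id] = None  # nobody labelled this defect
            continue
        top_count = max(votes.values())
        winners = sorted(l for l, c in votes.items() if c == top_count)
        aggregated[defect_id] = winners[0]
    return aggregated


if __name__ == "__main__":
    # Hypothetical crowd annotations of defect impact categories.
    crowd = {
        "DEFECT-1": ["reliability", "reliability", "usability"],
        "DEFECT-2": ["performance", None, "usability"],
        "DEFECT-3": [None, None, None],
    }
    for defect_id, label in majority_vote(crowd).items():
        print(defect_id, "->", label)
```

Majority voting treats every annotator as equally reliable and discards disagreement, which is one reason a crowd-aware learning method may perform better.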