
    Supporting Defect Causal Analysis in Practice with Cross-Company Data on Causes of Requirements Engineering Problems

    [Context] Defect Causal Analysis (DCA) represents an efficient practice to improve software processes. While knowledge on cause-effect relations is helpful to support DCA, collecting cause-effect data may require significant effort and time. [Goal] We propose and evaluate a new DCA approach that uses cross-company data to support the practical application of DCA. [Method] We collected cross-company data on causes of requirements engineering problems from 74 Brazilian organizations and built a Bayesian network. Our DCA approach uses the diagnostic inference of the Bayesian network to support DCA sessions. We evaluated our approach by applying a model for technology transfer to industry and conducted three consecutive evaluations: (i) in academia, (ii) with industry representatives of the Fraunhofer Project Center at UFBA, and (iii) in an industrial case study at the Brazilian National Development Bank (BNDES). [Results] We received positive feedback in all three evaluations, and the cross-company data was considered helpful for determining main causes. [Conclusions] Our results strengthen our confidence that supporting DCA with cross-company data is promising and should be further investigated. Comment: 10 pages, 8 figures, accepted for the 39th International Conference on Software Engineering (ICSE'17)
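
    Diagnostic inference in a Bayesian network means reasoning from an observed effect back to its likely causes. The following is a minimal sketch of that idea only, assuming a single observed problem and a set of mutually exclusive, exhaustive candidate causes; this is a strong simplification of the network described in the abstract. The cause names and all probabilities are hypothetical and are not taken from the study's data.

```python
def diagnostic_posterior(priors, likelihoods):
    """Return P(cause | problem observed) for each candidate cause.

    priors[c]      : P(cause = c)            (hypothetical values)
    likelihoods[c] : P(problem | cause = c)  (hypothetical values)
    Assumes the causes are mutually exclusive and exhaustive.
    """
    joint = {c: priors[c] * likelihoods[c] for c in priors}
    evidence = sum(joint.values())  # P(problem observed)
    return {c: j / evidence for c, j in joint.items()}


# Hypothetical numbers for illustration only.
priors = {
    "lack_of_time": 0.5,
    "communication_flaws": 0.3,
    "missing_domain_knowledge": 0.2,
}
likelihoods = {
    "lack_of_time": 0.2,
    "communication_flaws": 0.6,
    "missing_domain_knowledge": 0.5,
}

posterior = diagnostic_posterior(priors, likelihoods)
# Rank causes from most to least probable given the observed problem.
ranked = sorted(posterior, key=posterior.get, reverse=True)
```

    In a DCA session, a ranking of this kind could serve as a starting point for discussing which causes to investigate first.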

    Some Approaches for Software Defect Prediction

    The main idea of this thesis is to give a general overview of the processes within software defect prediction models that use machine learning classifiers and to analyze some of the results of the evaluation experiments conducted in the research papers covered in this work. Additionally, a brief explanation of the algorithms used within these models is given, and some of the evaluation measures used to assess their prediction accuracy are listed and explained. A general overview of the processes within a handful of specific software defect prediction models is also provided.

    Heterogeneous Cross-Project Defect Prediction using Encoder and Transfer Learning

    Heterogeneous cross-project defect prediction (HCPDP) aims to predict defects in new software projects using defect data from previous software projects where the source and target projects have some different metrics. Most existing methods only find linear relationships in the software defect features and datasets. Additionally, these methods use multiple defect datasets from different projects as source datasets. In this paper, we propose a novel method called heterogeneous cross-project defect prediction using encoder and transfer learning (ETL). ETL uses encoders to extract the important features from source and target datasets. Also, to minimize negative transfer during transfer learning, we used an augmented dataset that contains pseudo-labels and the source dataset. Additionally, we have used very limited data to train the model. To evaluate the performance of the ETL approach, 16 datasets from four publicly available software defect projects were used. Furthermore, we compared the proposed method with four HCPDP methods, namely EGW, HDP_KS, CTKCCA and EMKCA, and one WPDP method from existing literature. The proposed method on average outperforms the baseline methods in terms of PD, PF, F1-score, G-mean and AUC.
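
    The threshold-based measures named in this abstract can be computed from a confusion matrix. Below is a minimal sketch, assuming binary labels (1 = defective, 0 = clean) and assuming G-mean is defined as sqrt(PD * (1 - PF)); the paper may use a different G-mean definition. AUC is left out because it needs ranked scores rather than hard predictions. The function name is hypothetical.

```python
import math


def defect_metrics(y_true, y_pred):
    """Compute PD, PF, F1 and G-mean from binary labels (1 = defective)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    pd_ = tp / (tp + fn) if (tp + fn) else 0.0       # probability of detection (recall)
    pf = fp / (fp + tn) if (fp + tn) else 0.0        # probability of false alarm
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (2 * precision * pd_ / (precision + pd_)) if (precision + pd_) else 0.0
    g_mean = math.sqrt(pd_ * (1.0 - pf))             # assumed definition, see lead-in
    return {"PD": pd_, "PF": pf, "F1": f1, "G-mean": g_mean}


# Made-up labels for illustration only.
metrics = defect_metrics(
    y_true=[1, 1, 1, 0, 0, 0, 0, 1],
    y_pred=[1, 1, 0, 0, 0, 1, 0, 1],
)
```

    Higher values are better for PD, F1 and G-mean, while a lower PF is better.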

    A Cross-project Defect Prediction Model Using Feature Transfer and Ensemble Learning

    Cross-project defect prediction (CPDP) trains prediction models on existing data from other projects (the source projects) and uses the trained model to predict defects in the target projects. To solve two major problems in CPDP, namely variability in data distribution and class imbalance, we propose a CPDP model that combines feature transfer and ensemble learning in two stages: feature transfer and classification. The feature transfer method is based on the Pearson correlation coefficient and reduces both the dimension of the feature space and the difference in feature distribution between items. Class imbalance is addressed with SMOTE and voting, at both the algorithm and data levels. The experimental results on 20 source-target project pairs show that our method can yield significant improvement on CPDP.
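
    The abstract does not say how the Pearson correlation coefficient is applied. The sketch below shows one plausible reading, assumed here purely for illustration: rank each metric by the absolute value of its correlation with the defect label and keep the top k. The function names and the toy data are hypothetical, and the SMOTE and voting stages are not covered.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for a constant input."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def top_k_features(columns, labels, k):
    """Keep the k features most correlated (in absolute value) with the label."""
    scores = {name: abs(pearson(values, labels)) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy data: three metrics measured on four modules, with defect labels.
columns = {
    "loc": [10, 20, 30, 40],
    "noise": [5, 1, 5, 1],
    "const": [3, 3, 3, 3],
}
labels = [0, 0, 1, 1]
selected = top_k_features(columns, labels, k=1)
```

    Dropping weakly correlated metrics is one way to shrink the feature space before training; whether it also narrows the distribution gap between projects depends on the data.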

    BiLO-CPDP: Bi-Level Programming for Automated Model Discovery in Cross-Project Defect Prediction

    Cross-Project Defect Prediction (CPDP), which borrows data from similar projects by combining a transfer learner with a classifier, has emerged as a promising way to predict software defects when the available data about the target project is insufficient. However, developing such a model is challenging because it is difficult to determine the right combination of transfer learner and classifier along with their optimal hyper-parameter settings. In this paper, we propose a tool, dubbed BiLO-CPDP, which is the first of its kind to formulate automated CPDP model discovery from the perspective of bi-level programming. In particular, bi-level programming carries out the optimization with two nested levels in a hierarchical manner. Specifically, the upper-level optimization routine is designed to search for the right combination of transfer learner and classifier, while the nested lower-level optimization routine aims to optimize the corresponding hyper-parameter settings. To evaluate BiLO-CPDP, we conduct experiments on 20 projects to compare it with a total of 21 existing CPDP techniques, along with its single-level optimization variant and Auto-Sklearn, a state-of-the-art automated machine learning tool. Empirical results show that BiLO-CPDP achieves better prediction performance than all other 21 existing CPDP techniques on 70% of the projects, while being overwhelmingly superior to Auto-Sklearn and its single-level optimization variant in all cases. Furthermore, the unique bi-level formalization in BiLO-CPDP also permits allocating more budget to the upper level, which significantly boosts the performance.
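
    The nested structure described above can be illustrated with a small sketch. It is not BiLO-CPDP's actual algorithm: the upper level here simply enumerates every combination, the lower level uses plain random search over a single hyper-parameter, and `toy_score` is a made-up stand-in for training and validating a model. All names in the search space are hypothetical placeholders.

```python
import itertools
import random

# Hypothetical search space; not the tool's real portfolio.
TRANSFER_LEARNERS = ["tl_a", "tl_b"]
CLASSIFIERS = ["clf_x", "clf_y", "clf_z"]


def toy_score(transfer, classifier, hp):
    """Stand-in for 'train on source data, validate, return a score'."""
    peak = {"tl_a": 0.3, "tl_b": 0.7}[transfer]           # best hp depends on the learner
    base = {"clf_x": 0.60, "clf_y": 0.70, "clf_z": 0.65}[classifier]
    return base - (hp - peak) ** 2


def lower_level(transfer, classifier, budget, rng):
    """Lower level: tune the hyper-parameter for one fixed combination."""
    best_hp, best_score = None, float("-inf")
    for _ in range(budget):
        hp = rng.random()                                  # sample hp uniformly in [0, 1)
        score = toy_score(transfer, classifier, hp)
        if score > best_score:
            best_hp, best_score = hp, score
    return best_hp, best_score


def upper_level(budget_per_combo=50, seed=0):
    """Upper level: pick the combination whose tuned score is highest."""
    rng = random.Random(seed)
    best = None
    for transfer, classifier in itertools.product(TRANSFER_LEARNERS, CLASSIFIERS):
        hp, score = lower_level(transfer, classifier, budget_per_combo, rng)
        if best is None or score > best[3]:
            best = (transfer, classifier, hp, score)
    return best


transfer, classifier, hp, score = upper_level()
```

    Each combination is judged by its tuned score rather than by a default configuration, which is the point of nesting the two levels.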

    Deep Learning for Software Defect Prediction: An LSTM-based Approach

    Software defect prediction is an important aspect of software development, as it helps developers and organizations to identify and resolve bugs in the software before they become major issues. In this paper, we explore the use of machine learning algorithms for software defect prediction. We discuss the different types of machine learning algorithms that have been used for software defect prediction and their advantages and disadvantages. We also provide a comprehensive review of recent studies that have used machine learning algorithms for software defect prediction. The paper concludes with a discussion of the challenges and opportunities in using machine learning algorithms for software defect prediction and the future directions of research in this field. This paper surveys the existing literature on software defect prediction, focusing specifically on deep learning techniques. Compared to existing surveys on the topic, this paper offers a more in-depth analysis of the strengths and weaknesses of deep learning approaches for software defect prediction. It explores the use of LSTMs for this task, which have not been extensively studied in previous surveys. Additionally, this paper provides a comprehensive review of recent research in the field, highlighting the most promising deep learning models and techniques for software defect prediction. The results of this survey demonstrate that LSTM-based deep learning models can outperform traditional machine learning approaches and achieve state-of-the-art results in software defect prediction. Furthermore, this paper provides insights into the challenges and limitations of deep learning approaches for software defect prediction, highlighting areas for future research and improvement. Overall, this paper offers a valuable resource for researchers and practitioners interested in using deep learning techniques for software defect prediction.

    How Far Does the Predictive Decision Impact the Software Project? The Cost, Service Time, and Failure Analysis from a Cross-Project Defect Prediction Model

    Context: Cross-project defect prediction (CPDP) models are being developed to optimize the testing resources. Objectives: To propose an ensemble classification framework for CPDP, since many existing models lack strong performance, and to analyse the main objectives of CPDP from the outcomes of the proposed classification framework. Method: For the classification task, we propose a bootstrap aggregation based hybrid-inducer ensemble learning (HIEL) technique that uses a probabilistic weighted majority voting (PWMV) strategy. To assess the impact of HIEL on the software project, we propose three project-specific performance measures, namely percent of perfect cleans (PPC), percent of non-perfect cleans (PNPC), and false omission rate (FOR), computed from the predictions to calculate the amount of saved cost, remaining service time, and percent of the failures in the target project. Results: On many target projects from the PROMISE, NASA, and AEEEM repositories, the proposed model outperformed recent works such as TDS, TCA+, HYDRA, TPTL, and CODEP in terms of F-measure. In terms of AUC, the TCA+ and HYDRA models stand as strong competitors to the HIEL model. Conclusion: For better predictions, we recommend ensemble learning approaches for the CPDP models. To estimate the benefits from the CPDP models, we recommend the above project-specific performance measures.
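
    Of the three measures, only the false omission rate has a standard definition that can be stated without the paper: FOR = FN / (FN + TN), the share of modules predicted clean that are in fact defective. A minimal sketch follows, assuming binary labels with 1 = defective. PPC and PNPC are not defined in the abstract and are therefore not sketched. The function name and the sample labels are hypothetical.

```python
def false_omission_rate(y_true, y_pred):
    """FOR = FN / (FN + TN): defective modules among those predicted clean."""
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    predicted_clean = fn + tn
    if predicted_clean == 0:
        return 0.0  # nothing was predicted clean, so nothing was omitted
    return fn / predicted_clean


# Made-up labels: four modules are predicted clean, one of them is defective.
rate = false_omission_rate(
    y_true=[1, 1, 1, 0, 0, 0, 0, 1],
    y_pred=[1, 1, 0, 0, 0, 1, 0, 1],
)
```

    A lower FOR means fewer defective modules slip through untested when only the modules flagged as defective are inspected.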