59 research outputs found

    The scientific basis for prediction research

    Copyright © 2012 ACM. In recent years there has been huge growth in the use of statistical and machine learning methods to build useful prediction systems for software engineers. Of particular interest is predicting project effort and duration and defect behaviour. Unfortunately, though results are often promising, no single technique dominates, and there are clearly complex interactions between technique, training method and problem domain. Since we lack deep theory, our research is of necessity experimental. Minimally, as scientists, we need reproducible studies. We also need comparable studies. I will show through a meta-analysis of many primary studies that we are not presently in that situation, and so the scientific basis for our collective research remains in doubt. By way of remedy, I will argue that we need to address these issues of reporting protocols and expertise, plus ensure that blind analysis is routine.

    Data quality: Some comments on the NASA software defect datasets

    Background: Self-evidently, empirical analyses rely upon the quality of their data. Likewise, replications rely upon accurate reporting and the use of the same, rather than merely similar, versions of datasets. In recent years there has been much interest in using machine learners to classify software modules into defect-prone and not defect-prone categories. The publicly available NASA datasets have been extensively used as part of this research. Objective: This short note investigates the extent to which published analyses based on the NASA defect datasets are meaningful and comparable. Method: We analyze the five studies published in the IEEE Transactions on Software Engineering since 2007 that have utilized these datasets and compare the two versions of the datasets currently in use. Results: We find important differences between the two versions of the datasets, implausible values in one dataset, and generally insufficient detail documented on dataset preprocessing. Conclusions: It is recommended that researchers 1) indicate the provenance of the datasets they use, 2) report any preprocessing in sufficient detail to enable meaningful replication, and 3) invest effort in understanding the data prior to applying machine learners.
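
    As a concrete illustration of the preprocessing transparency the note calls for, the sketch below runs a few sanity checks over a defect dataset before any learner is applied. It is a minimal, hypothetical Python/pandas example: the file name and the column names (loc, v_g) are assumptions and will differ between versions of the NASA datasets.

    import pandas as pd

    def check_defect_dataset(df):
        """Flag rows with implausible static-metric values."""
        issues = {}
        # A module with zero or negative LOC cannot plausibly carry
        # the other size metrics recorded for it.
        issues["non_positive_loc"] = df[df["loc"] <= 0]
        # Cyclomatic complexity below 1 is impossible by definition.
        issues["complexity_below_one"] = df[df["v_g"] < 1]
        # Exact duplicate rows inflate the dataset and can leak
        # between training and test splits.
        issues["duplicate_rows"] = df[df.duplicated()]
        return issues

    df = pd.read_csv("kc1.csv")  # hypothetical local copy; record its provenance
    for name, rows in check_defect_dataset(df).items():
        print(f"{name}: {len(rows)} rows")

    Reporting such checks, together with the exact dataset version used, is the kind of provenance and preprocessing detail the authors recommend.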

    Modeling Software Project Defects With Fuzzy Logic Maps

    I propose a model for determining quality-related defects in software projects. The proposed model is based on fuzzy logic. The knowledge behind the fuzzy inference process is captured in a cognitive map of software defects. This map is developed for software projects, taking into account their characteristic lifecycle. The model was used to test a software project: the probability of not obtaining quality outputs was calculated using the quality-level concept and the over-budget sum. The calculated defect lies at the maximum level of the defects map. A software system for determining defects in software testing projects was also developed.
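
    Since the abstract gives no formal detail, the following is a generic sketch of fuzzy cognitive map inference, the mechanism such a defect map typically builds on; the three concepts and the weights are invented for illustration and are not the paper's actual map.

    import numpy as np

    def fcm_step(activations, weights, lam=1.0):
        """One inference step of a fuzzy cognitive map: each concept's new
        activation is the sigmoid-squashed weighted sum of its causes."""
        return 1.0 / (1.0 + np.exp(-lam * (activations @ weights)))

    # weights[i, j] = causal influence of concept i on concept j, in [-1, 1].
    # Concepts: 0 requirements volatility, 1 design defects, 2 failed tests.
    W = np.array([
        [0.0, 0.7, 0.0],   # volatility drives design defects
        [0.0, 0.0, 0.8],   # design defects drive failed tests
        [0.0, 0.0, 0.0],
    ])
    a = np.array([0.9, 0.1, 0.1])   # initial concept activations in [0, 1]
    for _ in range(20):             # iterate until activations settle
        a = fcm_step(a, W)
    print(a)  # the highest activation marks the dominant defect level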

    Risk Assessment Technology on the Application of Admission of New Students in High School

    The rapid development of technology means that almost all service activities use information technology, including service activities in schools. One of the services provided by the school is to make it easier for prospective students to register as new students by building an open-source-based new student registration application, so that prospective students can register anywhere without having to come to the school directly. The use of this application faces several technological threats, such as the system being locked after being compromised by hackers, phishing, virus attacks, attacks by former insiders who know the security of the computer system's data, unstable computer networks that slow down operational processes, and a low level of computer security. The purpose of this study is to provide recommendations for controlling the risks of using information technology in the new student registration application, so as to minimize future losses to the school, by measuring the likelihood and impact of threats to the computer technology in use. The risk assessment model uses the NIST SP 800-30r1 framework as a tool to measure how large the threat level is and the impact caused by attacks on the application. The NIST SP 800-30r1 framework has stages such as identifying system characteristics, threats, and vulnerabilities, analyzing existing system controls, determining likelihood, determining impact, determining risk, recommending controls, and documenting results. The results of this study serve as recommendations to minimize the school's losses and as a benchmark for controlling the risks of technology use to improve the quality of the school. Keywords: risk assessment technology, admission, high school
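
    To make the assessment stage concrete, here is a minimal sketch of the semi-quantitative likelihood-times-impact combination described in NIST SP 800-30r1. The five-level scale follows the framework, but the product-based combination rule and the example threats are simplifications invented for illustration; the standard itself combines the two levels via lookup tables.

    LEVELS = ["very low", "low", "moderate", "high", "very high"]

    def risk_level(likelihood, impact):
        """Combine 0-4 likelihood and impact indices into an overall risk level."""
        score = likelihood * impact          # 0..16, a stand-in for the table lookup
        if score >= 12: return "very high"
        if score >= 8:  return "high"
        if score >= 4:  return "moderate"
        if score >= 1:  return "low"
        return "very low"

    threats = {  # threat: (likelihood index, impact index), illustrative values
        "phishing of applicant credentials": (3, 3),
        "malware on the registration server": (2, 3),
        "unstable network slowing operations": (4, 1),
    }
    for threat, (l, i) in threats.items():
        print(f"{threat}: {LEVELS[l]} x {LEVELS[i]} -> {risk_level(l, i)} risk")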

    Too Trivial To Test? An Inverse View on Defect Prediction to Identify Methods with Low Fault Risk

    Background: Test resources are usually limited, and therefore it is often not possible to completely test an application before a release. To cope with the problem of scarce resources, development teams can apply defect prediction to identify fault-prone code regions. However, defect prediction tends to have low precision in cross-project prediction scenarios. Aims: We take an inverse view on defect prediction and aim to identify methods that can be deferred when testing because they contain hardly any faults due to their code being "trivial". We expect that the characteristics of such methods might be project-independent, so that our approach could improve cross-project predictions. Method: We compute code metrics and apply association rule mining to create rules for identifying methods with low fault risk. We conduct an empirical study to assess our approach with six Java open-source projects containing precise fault data at the method level. Results: Our results show that inverse defect prediction can identify approx. 32-44% of the methods of a project as having a low fault risk; on average, they are about six times less likely to contain a fault than other methods. In cross-project predictions with larger, more diversified training sets, identified methods are even eleven times less likely to contain a fault. Conclusions: Inverse defect prediction supports the efficient allocation of test resources by identifying methods that can be treated with less priority in testing activities, and it is well applicable in cross-project prediction scenarios. Comment: Submitted to PeerJ Computer Science
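
    The mined rules themselves are not listed in the abstract, so the sketch below illustrates the inverse-prediction idea with a single hand-written rule over hypothetical method metrics; the actual rules and thresholds come from association rule mining over the six projects' method-level fault data.

    def low_fault_risk(method):
        """A 'too trivial to test' rule: short, branch-free, loop-free methods."""
        return (method["sloc"] <= 3            # very short body
                and method["cyclomatic"] == 1  # straight-line code, no branching
                and not method["has_loop"])    # no iteration to get wrong

    methods = [
        {"name": "getId",    "sloc": 1,  "cyclomatic": 1, "has_loop": False},
        {"name": "parseRow", "sloc": 25, "cyclomatic": 6, "has_loop": True},
    ]
    # Deprioritize matching methods when allocating scarce test effort.
    print([m["name"] for m in methods if low_fault_risk(m)])  # ['getId']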