
    Data quality: Some comments on the NASA software defect datasets

    Background: Self-evidently, empirical analyses rely upon the quality of their data. Likewise, replications rely upon accurate reporting and using the same rather than similar versions of datasets. In recent years, there has been much interest in using machine learners to classify software modules into defect-prone and not defect-prone categories. The publicly available NASA datasets have been extensively used as part of this research. Objective: This short note investigates the extent to which published analyses based on the NASA defect datasets are meaningful and comparable. Method: We analyze the five studies published in the IEEE Transactions on Software Engineering since 2007 that have utilized these datasets and compare the two versions of the datasets currently in use. Results: We find important differences between the two versions of the datasets, implausible values in one dataset, and generally insufficient detail documented on dataset preprocessing. Conclusions: It is recommended that researchers 1) indicate the provenance of the datasets they use, 2) report any preprocessing in sufficient detail to enable meaningful replication, and 3) invest effort in understanding the data prior to applying machine learners.
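
    The final recommendation, understanding the data before applying machine learners, can be illustrated with a short sanity check. The following is a minimal sketch assuming a CSV export of one NASA defect dataset; the file name and column names (LOC_TOTAL, CYCLOMATIC_COMPLEXITY) are hypothetical placeholders, not the datasets' actual schema.

    import pandas as pd

    # Load one NASA defect dataset; file and column names are placeholders.
    df = pd.read_csv("kc1.csv")

    # Duplicate rows can leak between training and test splits.
    n_duplicates = df.duplicated().sum()

    # Implausible values, e.g. modules with non-positive lines of code or
    # cyclomatic complexity exceeding total lines of code.
    implausible = df[(df["LOC_TOTAL"] <= 0) |
                     (df["CYCLOMATIC_COMPLEXITY"] > df["LOC_TOTAL"])]

    # Missing cells that some learners silently drop or impute.
    n_missing = df.isna().sum().sum()

    print(f"duplicates: {n_duplicates}, implausible rows: {len(implausible)}, "
          f"missing cells: {n_missing}")

    Reporting the output of checks like these alongside any preprocessing steps would make analyses based on these datasets far easier to replicate and compare.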

    Analysis of Data mining based Software Defect Prediction Techniques

    A software bug repository is the main resource for identifying fault-prone modules, and different data mining algorithms are used to extract fault-prone modules from these repositories. A software development team tries to increase software quality by decreasing the number of defects as much as possible. In this paper, different data mining techniques for identifying fault-prone modules are discussed, and the algorithms are compared to find the best one for defect prediction.

    Software Defect Prediction Based on Classification Rule Mining

    Software development has grown rapidly, and, due to various causes, software comes with many defects. In the software development process, testing is the main phase that reduces the defects of the software. If a developer or tester can predict software defects properly, this reduces cost, time, and effort. In this paper, we show a comparative analysis of software defect prediction based on classification rule mining. We propose a scheme for this process, choose different classification algorithms, and compare their defect predictions. The evaluation analyzes the prediction performance of competing learning schemes on given historical data sets (the NASA MDP data sets). The result of this scheme evaluation shows that a different classifier rule has to be chosen for each data set.
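
    To make this kind of scheme evaluation concrete, here is a minimal sketch, not the paper's actual protocol, that compares two common classification learners on a defect data set using scikit-learn; the file name and the 'defective' label column are assumed placeholders.

    import pandas as pd
    from sklearn.model_selection import cross_val_score
    from sklearn.naive_bayes import GaussianNB
    from sklearn.tree import DecisionTreeClassifier

    # Placeholder data set: numeric module metrics plus a binary label.
    df = pd.read_csv("nasa_mdp_modules.csv")
    X = df.drop(columns=["defective"])
    y = df["defective"]

    # Evaluate each learner with 10-fold cross-validation on the same data.
    for name, clf in [("decision tree", DecisionTreeClassifier(random_state=0)),
                      ("naive Bayes", GaussianNB())]:
        scores = cross_val_score(clf, X, y, cv=10, scoring="f1")
        print(f"{name}: mean F1 = {scores.mean():.3f}")

    Repeating this loop over several data sets is exactly the kind of experiment that reveals why no single classifier dominates and a different rule may be needed per data set.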

    Fuzzy Logic Based Software Reliability Quantification Framework: Early Stage Perspective (FLSRQF)

    Today, the influence of information technology has been spreading exponentially, from high-level research in the world's top labs to home appliances. Such huge demand is compelling developers to produce more software to meet user expectations. As a result, reliability has emerged as a critical quality factor that cannot be compromised, and researchers are continuously making efforts to meet this challenge. In this spirit, the authors of the paper propose a highly structured framework that guides the process of quantifying software reliability before coding of the software starts. Before presenting the framework, to establish its need and significance, the paper presents the state of the art in software reliability quantification. The strength of fuzzy set theory is utilized to overcome the subjectivity of requirements-stage measures. Salient features of the framework are also highlighted at the end of the paper.
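
    As a toy illustration of the general fuzzy quantification idea, not of FLSRQF itself, the sketch below fuzzifies a subjective requirements-stage rating with triangular membership functions and defuzzifies it into a single reliability estimate; the rating scale, membership functions, and level weights are all invented for the example.

    def triangular(x, a, b, c):
        """Triangular membership function with feet at a and c, peak at b."""
        if x <= a or x >= c:
            return 0.0
        return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

    # Hypothetical expert rating of a requirements-stage factor on a 0-10 scale.
    rating = 6.5

    # Fuzzify the rating into three linguistic reliability levels.
    low = triangular(rating, 0.0, 2.0, 5.0)
    medium = triangular(rating, 3.0, 5.0, 8.0)
    high = triangular(rating, 6.0, 9.0, 10.0)

    # Defuzzify: weighted average of a representative value for each level.
    levels = [(low, 0.2), (medium, 0.5), (high, 0.9)]
    reliability = (sum(m * v for m, v in levels) /
                   sum(m for m, _ in levels))
    print(f"estimated reliability: {reliability:.2f}")

    The point of the fuzzy step is that an imprecise expert judgment ("fairly reliable") is turned into a number in a transparent, repeatable way rather than by an ad hoc guess.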

    Evaluating the effectiveness of data quality framework in software engineering

    The quality of data is important in research working with data sets because poor data quality may lead to invalid results. Data sets contain measurements that are associated with metrics and entities; however, in some data sets it is not always clear which entities have been measured and exactly which metrics have been used, which means that measurements could be misinterpreted. In this study, we develop a framework for data quality assessment that determines whether a data set has sufficient information to support the correct interpretation of data for analysis in empirical research. The framework incorporates a dataset metamodel and a quality assessment process to evaluate data set quality. To evaluate the effectiveness of our framework, we conducted a user study, using observations, a questionnaire, and a think-aloud approach to gain insight into the framework through participants' thought processes while applying it. The results of our study provide evidence that most participants successfully applied the definitions of dataset category elements and the formal definitions of data quality issues to the data sets. Further work is needed to reproduce our results with more participants and to determine whether the data quality framework generalizes to other types of data sets.
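
    The following is a minimal sketch of the framework's core idea: a data set should document which entities were measured and with which metrics, so the measurements can be interpreted correctly. The metadata fields below are illustrative placeholders, not the paper's actual metamodel.

    # Illustrative metadata record for a data set; the required fields echo the
    # idea of a dataset metamodel, not the paper's actual schema.
    dataset_metadata = {
        "name": "example-defect-data",
        "entity": "software module",                      # what was measured
        "metrics": {"LOC_TOTAL": "total lines of code"},  # how it was measured
        "provenance": None,                               # where the data came from
    }

    REQUIRED_FIELDS = ("name", "entity", "metrics", "provenance")

    # Flag fields that are absent or empty, i.e. measurements whose context is
    # undocumented and therefore open to misinterpretation.
    issues = [f for f in REQUIRED_FIELDS if not dataset_metadata.get(f)]

    print("quality issues:", issues if issues else "none; metadata is sufficient")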