An Empirical analysis of Open Source Software Defects data through Software Reliability Growth Models

Authors: Morisio Maurizio, Ullah Najeeb
Publication date: 01/01/2013

The purpose of this study is to analyze the reliability growth of Open Source Software (OSS) using Software Reliability Growth Models (SRGMs). The study uses defect data from twenty-five releases of five OSS projects. For each release, two types of datasets were created: datasets built with respect to the defect creation date (created date DS) and datasets built with respect to the defect updated date (updated date DS). These defect datasets are modelled by eight SRGMs: Musa Okumoto, Inflection S-Shaped, Goel Okumoto, Delayed S-Shaped, Logistic, Gompertz, Yamada Exponential, and Generalized Goel Model. These models were chosen because of their widespread use in the literature. The SRGMs are fitted to both types of defect datasets for each project, and their fitting and prediction capabilities are analysed to study OSS reliability growth with respect to defect creation and defect updating time, because defect analysis can be used as a constructive reliability predictor. Results show that SRGM fitting capability and prediction quality improve when the defect creation date is used to build the OSS defect datasets that characterize reliability growth. Hence OSS reliability growth can be better characterized with SRGMs if the defect creation date is used instead of the defect updating (fixing) date when building OSS defect datasets for reliability modelling.
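To make the idea of fitting an SRGM concrete, the sketch below fits the Goel Okumoto model, whose mean value function is commonly written m(t) = a(1 - exp(-b t)), to cumulative defect counts. It is only an illustration and not the study's procedure: the grid search over b, the function names, and the synthetic weekly data are all assumptions made for this example.

```python
import math


def goel_okumoto(t, a, b):
    """Expected cumulative defects by time t: m(t) = a * (1 - exp(-b * t))."""
    return a * (1.0 - math.exp(-b * t))


def fit_goel_okumoto(times, cumulative_defects, b_grid=None):
    """Least-squares fit: grid search over b, closed-form a for each b.

    Returns (a, b, sse) for the grid point with the smallest squared error.
    The default grid (0.001 .. 1.000) is an arbitrary choice for this sketch.
    """
    if b_grid is None:
        b_grid = [i / 1000.0 for i in range(1, 1001)]
    best = None
    for b in b_grid:
        g = [1.0 - math.exp(-b * t) for t in times]
        denom = sum(x * x for x in g)
        if denom == 0.0:
            continue
        # For fixed b the model is linear in a, so a has a closed form.
        a = sum(y * x for y, x in zip(cumulative_defects, g)) / denom
        sse = sum((y - a * x) ** 2 for y, x in zip(cumulative_defects, g))
        if best is None or sse < best[2]:
            best = (a, b, sse)
    return best


# Synthetic, noise-free data generated from a = 100, b = 0.1 (not real defect data).
weeks = list(range(1, 21))
observed = [goel_okumoto(t, 100.0, 0.1) for t in weeks]
a_hat, b_hat, sse = fit_goel_okumoto(weeks, observed)
```

With real data, the same fit could be run once on a dataset keyed by creation date and once on a dataset keyed by updated date, and the two residual errors compared.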

Crossref

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)


Bug or Not? Bug Report Classification Using N-Gram IDF

Authors: Hata Hideaki, Matsumoto Kenichi, Phannachitta Passakorn, Terdchanakul Pannavat
Publication date: 01/01/2017

Previous studies have found that a significant number of bug reports are misclassified between bugs and non-bugs, and that manually classifying bug reports is a time-consuming task. To address this problem, we propose a bug report classification model with N-gram IDF, a theoretical extension of Inverse Document Frequency (IDF) for handling words and phrases of any length. N-gram IDF enables us to extract key terms of any length from texts; these key terms can be used as features to classify bug reports. We build classification models with logistic regression and random forest using features from N-gram IDF and from topic modeling, which is widely used in various software engineering tasks. On a publicly available dataset, our N-gram IDF-based models outperform the topic-based models in all of the evaluated cases. Our models show promising results and have the potential to be extended to other software engineering tasks. Comment: 5 pages, ICSME 201
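As a rough illustration of scoring phrases of several lengths, the sketch below computes ordinary IDF over word n-grams of up to three tokens. This is a deliberately simplified stand-in and not the N-gram IDF weighting the abstract refers to. The function names, the whitespace tokenizer, and the three toy reports are assumptions made for this example.

```python
import math
from collections import Counter


def word_ngrams(text, max_n=3):
    """Return the set of word n-grams (n = 1..max_n) in one document."""
    tokens = text.lower().split()
    return {
        " ".join(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    }


def ngram_idf(documents, max_n=3):
    """Plain IDF per n-gram: log(N / document frequency)."""
    doc_freq = Counter()
    for doc in documents:
        # Each document contributes at most 1 to an n-gram's frequency.
        doc_freq.update(word_ngrams(doc, max_n))
    total = len(documents)
    return {gram: math.log(total / df) for gram, df in doc_freq.items()}


# Toy bug-report titles, invented for illustration only.
reports = [
    "app crashes on startup",
    "add dark mode option",
    "app crashes on login",
]
idf = ngram_idf(reports)
```

The resulting scores could serve as feature weights for a classifier such as logistic regression; phrases that occur in fewer reports receive higher weights.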

arXiv.org e-Print Archive

NAIST Academic Repository

Crossref

Empirical Assessment of the Impact of Automatic Static Analysis on Code Quality

Author: Vetro' Antonio
Publication date: 01/01/2010

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)


Categorizing and predicting reopened bug reports to improve software reliability

Author: Gawade Rishikesh
Publication venue: DigitalCommons@UNO
Publication date: 01/08/2013

Software maintenance takes two-thirds of a project's life cycle, and bug fixes are an important part of it. Bugs are tracked using online tools like Bugzilla. It has been noted that around 10% of fixes are themselves buggy. Many bugs are documented as fixed when they are not actually fixed, which reduces the reliability of the software. These overlooked bugs are critical because they take more resources to fix when discovered, and since they are not documented, the defects are still present and reduce the reliability of the software. There have been very few studies aimed at understanding these bugs. The best way to understand them is to mine software repositories. To generalize findings, we need a large amount of bug information and a wide range of software projects. To this end, a web crawler collected around a million bug reports from online repositories and extracted important attributes of the bug reports. We selected four algorithms: Bayesian network, NaiveBayes, C4.5 decision tree, and Alternating decision tree. We achieved reasonable accuracy in predicting reopened bugs across a wide range of projects. Using AdaBoost, we analyzed the most important factors responsible for the bugs and grouped them into three categories: reputation of committer, complex units, and insufficient knowledge of defect

The University of Nebraska, Omaha