Search CORE

201 research outputs found

Categorizing and predicting reopened bug reports to improve software reliability

Author: Gawade Rishikesh
Publication venue: DigitalCommons@UNO
Publication date: 01/08/2013
Field of study

Software maintenance takes two thirds of the life cycle of the project. Bug fixes are an important part of software maintenance. Bugs are tracked using online tools like Bugzilla. It has been noted that around 10% of fixes are buggy fixes. Many bugs are documented as fixed when they are not actually fixed, thus reducing the reliability of the software. The overlooked bugs are critical as they take more resources to fix when discovered, and since they are not documented, the reality is that defect are still present and reduce reliability of software. There have been very few studies in understanding these bugs. The best way to understand these bugs is to mine software repositories. To generalize findings we need a large number of bug information and a wide category of software projects. To solve the problem, a web crawler collected around a million bug reports from online repositories, and extracted important attributes of the bug reports. We selected four algorithms: Bayesian network, NaiveBayes, C4.5 decision tree, and Alternating decision tree. We achieved a decent amount of accuracy in predicting reopened bugs across a wide range of projects. Using AdaBoost, we analyzed the most important factors responsible for the bugs and categorized them in three categories of reputation of committer, complex units, and insufficient knowledge of defect

The University of Nebraska, Omaha

Understanding the Impact of Diversity in Software Bugs on Bug Prediction Models

Author: Valdivia-Garcia Harold
Publication venue: RIT Scholar Works
Publication date: 22/07/2016
Field of study

Nowadays, software systems are essential for businesses, users and society. At the same time such systems are growing both in complexity and size. In this context, developing high-quality software is a challenging and expensive activity for the software industry. Since software organizations are always limited by their budget, personnel and time, it is not a trivial task to allocate testing and code-review resources to areas that require the most attention. To overcome the above problem, researchers have developed software bug prediction models that can help practitioners to predict the most bug-prone software entities. Although, software bug prediction is a very popular research area, yet its industrial adoption remains limited. In this thesis, we investigate three possible issues with the current state-of-the-art in software bug prediction that affect the practical usability of prediction models. First, we argue that current bug prediction models implicitly assume that all bugs are the same without taking into consideration their impact. We study the impact of bugs in terms of experience of the developers required to fix them. Second, only few studies investigate the impact of specific type of bugs. Therefore, we characterize a severe type of bug called Blocking bugs, and provide approaches to predict them early on. Third, false-negative files are buggy files that bug prediction models incorrectly as non-buggy files. We argue that a large number of false-negative files makes bug prediction models less attractive for developers. In our thesis, we quantify the extent of false-negative files, and manually inspect them in order to better understand their nature

RIT Scholar Works

Characterizing and Predicting Blocking Bugs in Open Source Projects

Author: Nagappan Mei
Shihab Emad
Valdivia-Garcia Harold
Publication venue: 'Elsevier BV'
Publication date: 06/04/2018
Field of study

Software engineering researchers have studied specific types of issues such reopened bugs, performance bugs, dormant bugs, etc. However, one special type of severe bugs is blocking bugs. Blocking bugs are software bugs that prevent other bugs from being fixed. These bugs may increase maintenance costs, reduce overall quality and delay the release of the software systems. In this paper, we study blocking bugs in eight open source projects and propose a model to predict them early on. We extract 14 different factors (from the bug repositories) that are made available within 24 hours after the initial submission of the bug reports. Then, we build decision trees to predict whether a bug will be a blocking bugs or not. Our results show that our prediction models achieve F-measures of 21%-54%, which is a two-fold improvement over the baseline predictors. We also analyze the fixes of these blocking bugs to understand their negative impact. We find that fixing blocking bugs requires more lines of code to be touched compared to non-blocking bugs. In addition, our file-level analysis shows that files affected by blocking bugs are more negatively impacted in terms of cohesion, coupling complexity and size than files affected by non-blocking bugs

Crossref

Concordia University Research Repository

Towards Automated Performance Bug Identification in Python

Author: Mazzawi Elie
Miranskyy Andriy
Tsakiltsidis Sokratis
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 28/07/2016
Field of study

Context: Software performance is a critical non-functional requirement, appearing in many fields such as mission critical applications, financial, and real time systems. In this work we focused on early detection of performance bugs; our software under study was a real time system used in the advertisement/marketing domain. Goal: Find a simple and easy to implement solution, predicting performance bugs. Method: We built several models using four machine learning methods, commonly used for defect prediction: C4.5 Decision Trees, Na\"{\i}ve Bayes, Bayesian Networks, and Logistic Regression. Results: Our empirical results show that a C4.5 model, using lines of code changed, file's age and size as explanatory variables, can be used to predict performance bugs (recall=0.73, accuracy=0.85, and precision=0.96). We show that reducing the number of changes delivered on a commit, can decrease the chance of performance bug injection. Conclusions: We believe that our approach can help practitioners to eliminate performance bugs early in the development cycle. Our results are also of interest to theoreticians, establishing a link between functional bugs and (non-functional) performance bugs, and explicitly showing that attributes used for prediction of functional bugs can be used for prediction of performance bugs

arXiv.org e-Print Archive

Crossref

Recommended from our members

Analysing the Resolution of Security Bugs in Software Maintenance

Author: Saleem Saad Bin
Publication venue
Publication date: 22/05/2015
Field of study

Security bugs in software systems are often reported after incidents of malicious attacks. Developers often need to resolve these bugs quickly in order to maintain the security of such systems. Bug resolution includes two kinds of activities: triaging confirms that the bugs are indeed security problems, after which fixing involves making changes to the code. It is reported in the literature that, statistically, security bugs are reopened more often compared to others, which poses two new research questions: (a) Are developers “rushing” to triage security bugs too soon under the pressure of deadlines? (b) Do developers need to spend more time fixing security bugs to avoid frequent reopening? This thesis explores these questions in order to determine whether security bug fixing should take a higher priority than other bugs to avoid malicious attackers exploiting vulnerabilities before the problems are fixed, and whether security bug fixing should take a higher priority than other bugs. In this thesis a quantitative approach has been adopted by conducting statistical empirical studies to observe the behaviour of software developers engaged in dealing with security bugs. Firstly, the concept of "rush'' has been borrowed from the time management literature to refer to the behaviour of people delivering work under the pressure of deadlines. By observing how developers deliver bug resolution before the deadline of releases, the degree of rush has been measured as the ratio between the actual time spent by developers during triaging and the theoretical time the developers have by delaying the fixes until the next regular release. In this thesis, a suggest that delaying bug assignment helps find the right developer and gives the developer more time to prepare for the same workload with more relaxed planning constraints. Secondly, to analyse the complexity of security bug fixes, the fan-in complexity of functions relevant to security bugs has been measured, rather than simply measuring the time spent by the software developers on the fixing of such bugs. The first null hypothesis is tested using a Man-Whitney method on five software case studies, Samba, MozillaFirefox, RedHat, FreeBSD and Mozilla. The second null hypothesis is tested by comparing the results of fixing security and non-security bugs from the Samba and MozillaFirefox case studies. Statistically significant results suggest that security bugs are triaged in a rush compared to non-security bugs for RedHat, FreeBSD and Mozilla. In terms of fan-in, the results of the Samba and MozillaFirefox case studies suggest that security bugs are more complex to fix compared to non-security bugs

Open Research Online (The Open University)

Classifying Web Exploits with Topic Modeling

Author: Ruohonen Jukka
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 16/10/2017
Field of study

This short empirical paper investigates how well topic modeling and database meta-data characteristics can classify web and other proof-of-concept (PoC) exploits for publicly disclosed software vulnerabilities. By using a dataset comprised of over 36 thousand PoC exploits, near a 0.9 accuracy rate is obtained in the empirical experiment. Text mining and topic modeling are a significant boost factor behind this classification performance. In addition to these empirical results, the paper contributes to the research tradition of enhancing software vulnerability information with text mining, providing also a few scholarly observations about the potential for semi-automatic classification of exploits in the existing tracking infrastructures.Comment: Proceedings of the 2017 28th International Workshop on Database and Expert Systems Applications (DEXA). http://ieeexplore.ieee.org/abstract/document/8049693

arXiv.org e-Print Archive

Crossref