711 research outputs found
Categorizing and predicting reopened bug reports to improve software reliability
Software maintenance takes two thirds of the life cycle of the project. Bug fixes are an important part of software maintenance. Bugs are tracked using online tools like Bugzilla. It has been noted that around 10% of fixes are buggy fixes. Many bugs are documented as fixed when they are not actually fixed, thus reducing the reliability of the software. The overlooked bugs are critical as they take more resources to fix when discovered, and since they are not documented, the reality is that defect are still present and reduce reliability of software. There have been very few studies in understanding these bugs. The best way to understand these bugs is to mine software repositories. To generalize findings we need a large number of bug information and a wide category of software projects. To solve the problem, a web crawler collected around a million bug reports from online repositories, and extracted important attributes of the bug reports. We selected four algorithms: Bayesian network, NaiveBayes, C4.5 decision tree, and Alternating decision tree. We achieved a decent amount of accuracy in predicting reopened bugs across a wide range of projects. Using AdaBoost, we analyzed the most important factors responsible for the bugs and categorized them in three categories of reputation of committer, complex units, and insufficient knowledge of defect
Recommended from our members
Analysing the Resolution of Security Bugs in Software Maintenance
Security bugs in software systems are often reported after incidents of malicious attacks. Developers often need to resolve these bugs quickly in order to maintain the security of such systems. Bug resolution includes two kinds of activities: triaging confirms that the bugs are indeed security problems, after which fixing involves making changes to the code.
It is reported in the literature that, statistically, security bugs are reopened more often compared to others, which poses two new research questions: (a) Are developers “rushing” to triage security bugs too soon under the pressure of deadlines? (b) Do developers need to spend more time fixing security bugs to avoid frequent reopening?
This thesis explores these questions in order to determine whether security bug fixing should take a higher priority than other bugs to avoid malicious attackers exploiting vulnerabilities before the problems are fixed, and whether security bug fixing should take a higher priority than other bugs.
In this thesis a quantitative approach has been adopted by conducting statistical empirical studies to observe the behaviour of software developers engaged in dealing with security bugs.
Firstly, the concept of "rush'' has been borrowed from the time management literature to refer to the behaviour of people delivering work under the pressure of deadlines. By observing how developers deliver bug resolution before the deadline of releases, the degree of rush has been measured as the ratio between the actual time spent by developers during triaging and the theoretical time the developers have by delaying the fixes until the next regular release.
In this thesis, a suggest that delaying bug assignment helps find the right developer and gives the developer more time to prepare for the same workload with more relaxed planning constraints. Secondly, to analyse the complexity of security bug fixes, the fan-in complexity of functions relevant to security bugs has been measured, rather than simply measuring the time spent by the software developers on the fixing of such bugs.
The first null hypothesis is tested using a Man-Whitney method on five software case studies, Samba, MozillaFirefox, RedHat, FreeBSD and Mozilla. The second null hypothesis is tested by comparing the results of fixing security and non-security bugs from the Samba and MozillaFirefox case studies.
Statistically significant results suggest that security bugs are triaged in a rush compared to non-security bugs for RedHat, FreeBSD and Mozilla.
In terms of fan-in, the results of the Samba and MozillaFirefox case studies suggest that security bugs are more complex to fix compared to non-security bugs
Characterizing and Predicting Blocking Bugs in Open Source Projects
Software engineering researchers have studied specific types of issues such reopened bugs, performance bugs, dormant bugs, etc. However, one special type of severe bugs is blocking bugs. Blocking bugs are software bugs that prevent other bugs from being fixed. These bugs may increase maintenance costs, reduce overall quality and delay the release of the software systems. In this paper, we study blocking bugs in eight open source projects and propose a model to predict them early on. We extract 14 different factors (from the bug repositories) that are made available within 24 hours after the initial submission of the bug reports. Then, we build decision trees to predict whether a bug will be a blocking bugs or not. Our results show that our prediction models achieve F-measures of 21%-54%, which is a two-fold improvement over the baseline predictors. We also analyze the fixes of these blocking bugs to understand their negative impact. We find that fixing blocking bugs requires more lines of code to be touched compared to non-blocking bugs. In addition, our file-level analysis shows that files affected by blocking bugs are more negatively impacted in terms of cohesion, coupling complexity and size than files affected by non-blocking bugs
Understanding the Impact of Diversity in Software Bugs on Bug Prediction Models
Nowadays, software systems are essential for businesses, users and society. At the same time such systems are growing both in complexity and size. In this context, developing high-quality software is a challenging and expensive activity for the software industry. Since software organizations are always limited by their budget, personnel and time, it is not a trivial task to allocate testing and code-review resources to areas that require the most attention. To overcome the above problem, researchers have developed software bug prediction models that can help practitioners to predict the most bug-prone software entities. Although, software bug prediction is a very popular research area, yet its industrial adoption remains limited.
In this thesis, we investigate three possible issues with the current state-of-the-art in software bug prediction that affect the practical usability of prediction models. First, we argue that current bug prediction models implicitly assume that all bugs are the same without taking into consideration their impact. We study the impact of bugs in terms of experience of the developers required to fix them. Second, only few studies investigate the impact of specific type of bugs. Therefore, we characterize a severe type of bug called Blocking bugs, and provide approaches to predict them early on. Third, false-negative files are buggy files that bug prediction models incorrectly as non-buggy files. We argue that a large number of false-negative files makes bug prediction models less attractive for developers. In our thesis, we quantify the extent of false-negative files, and manually inspect them in order to better understand their nature
Data-Driven Application Maintenance: Views from the Trenches
In this paper we present our experience during design, development, and pilot
deployments of a data-driven machine learning based application maintenance
solution. We implemented a proof of concept to address a spectrum of
interrelated problems encountered in application maintenance projects including
duplicate incident ticket identification, assignee recommendation, theme
mining, and mapping of incidents to business processes. In the context of IT
services, these problems are frequently encountered, yet there is a gap in
bringing automation and optimization. Despite long-standing research around
mining and analysis of software repositories, such research outputs are not
adopted well in practice due to the constraints these solutions impose on the
users. We discuss need for designing pragmatic solutions with low barriers to
adoption and addressing right level of complexity of problems with respect to
underlying business constraints and nature of data.Comment: Earlier version of paper appearing in proceedings of the 4th
International Workshop on Software Engineering Research and Industrial
Practice (SER&IP), IEEE Press, pp. 48-54, 201
Determinants of Success of the Open Source Selective Revealing Strategy: Solution Knowledge Emergence
Recent research suggests that firms may be able to create a competitive advantage by deliberately revealing specific problem knowledge beyond firm boundaries to open source meta-organisations such that new solution knowledge is created that benefits the focal firm more than its competitors (Alexy, George, & Salter, 2013). Yet, not all firms that use knowledge revealing strategies are successful in inducing the emergence of solution knowledge. The extant literature has as of yet not explained this heterogeneity in success of knowledge revealing strategies. Using a longitudinal database spanning the period from 1998 to end 2012 with more than 2 billion data points that was obtained from the Mozilla Foundation, one of the top open source meta-organisations, this dissertation identifies and measures the antecedent factors affecting successful solution knowledge emergence. The results reveal 35 antecedent factors that affect solution knowledge emergence in different ways across three levels of analysis. The numerous contributions to theory and practice that follow from the results are discussed
- …