    A Novel Way of Assessing Software Bug Severity Using Dictionary of Critical Terms

    Due to increasing demand for software and shrinking delivery schedules, assuring software quality is becoming a challenge. No software can claim to be error free, owing to the complexity of software and inadequate testing. A well-known principle of testing states that exhaustive testing is impossible. Hence, maintenance activities are required to ensure smooth functioning of the software. Many open source software projects provide bug tracking systems to aid the corrective maintenance task. These bug tracking systems allow users to report the bugs that are encountered while operating the software. In software maintenance, severity prediction has gained much attention recently. Bugs with higher severity should be fixed before bugs with lower severity. A triager analyzes the bug reports and assesses the severity based upon his/her knowledge and experience. But due to the large number of bug reports, manually assigning severity becomes a tedious job. Thus, there is a growing need to automate the whole process of severity prediction. The paper presents an approach for creating a dictionary of critical terms specifying severity using two different feature selection methods, namely info gain and Chi square; classification of bug reports is performed using the Naïve Bayes Multinomial (NBM) and K-nearest neighbor (KNN) algorithms.
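
The Chi square selection step mentioned above can be illustrated with a minimal sketch. It assumes a binary severe/non-severe labeling and whitespace tokenization; the function names `chi_square_scores` and `critical_terms` and the sample reports are hypothetical and are not taken from the paper.

```python
from collections import Counter


def chi_square_scores(reports):
    """Score each term by its chi-square association with the severity label.

    reports: list of (text, is_severe) pairs (a hypothetical input format).
    """
    n = len(reports)
    n_severe = sum(1 for _, severe in reports if severe)
    with_term = Counter()         # number of reports containing the term
    severe_with_term = Counter()  # number of severe reports containing the term
    for text, severe in reports:
        for term in set(text.lower().split()):
            with_term[term] += 1
            if severe:
                severe_with_term[term] += 1

    scores = {}
    for term, doc_freq in with_term.items():
        a = severe_with_term[term]   # severe, term present
        b = doc_freq - a             # non-severe, term present
        c = n_severe - a             # severe, term absent
        d = n - n_severe - b         # non-severe, term absent
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def critical_terms(reports, top_k):
    """Return the top_k highest-scoring terms, ties broken alphabetically."""
    scores = chi_square_scores(reports)
    return sorted(scores, key=lambda term: (-scores[term], term))[:top_k]


sample_reports = [
    ("application crash on startup", True),
    ("crash while saving data loss", True),
    ("typo in menu label", False),
    ("wrong color for menu icon", False),
]
print(critical_terms(sample_reports, 2))
```

Note that the chi-square statistic is symmetric: a term that appears only in non-severe reports scores as high as one that appears only in severe reports, so a real dictionary would probably also record which class each term indicates.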

    Recommending Issue Reports to Developers Using Machine Learning

    The development of a software system is often done through an iterative process, and different change requests arise when bugs and defects are detected or new features need to be added. These requirements are recorded as issue reports and put in the backlog of the software project for developers to work on. The assignment of these issue reports to developers is done in different ways. One common approach is self-assignment, where the developers themselves pick the issue reports they are interested in and assign those reports to themselves.
Practising self-assignment in large projects can be challenging for developers because the backlog of a large project becomes loaded with many issue reports, which makes it hard for developers to filter out the issue reports in line with their interests. To tackle this problem, a machine learning-based recommender system is proposed in this thesis. This recommender system can learn from the history of the issue reports that each developer worked on previously and recommend new issue reports suited to each developer. To implement this recommender system, issue reports were collected from 6 different open-source projects, and different machine learning techniques were applied and compared in order to determine the most suitable one. For evaluating the performance of the recommender system, the Precision, Recall, F1-score and Mean Average Precision metrics were used. The results show that, from a backlog of 100 issue reports, by recommending the top 10 issue reports to each developer a recall ranging from 52.9% up to 96% can be achieved, which is 6 to 9.5 times better than picking 10 issue reports randomly. Comparable improvements were also achieved in the other metrics.
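
As a minimal sketch of the evaluation metrics named above, the following computes precision, recall and F1 at a cutoff k, plus average precision for one developer. The function names and the sample issue identifiers are hypothetical, and normalizing average precision by min(|relevant|, k) is one common convention that the thesis may or may not use.

```python
def precision_recall_f1_at_k(recommended, relevant, k):
    """Precision, recall and F1 over the top-k recommended issue reports."""
    top_k = recommended[:k]
    hits = sum(1 for issue in top_k if issue in relevant)
    precision = hits / k if k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def average_precision_at_k(recommended, relevant, k):
    """Average of precision values taken at each rank where a hit occurs."""
    hits = 0
    precision_sum = 0.0
    for rank, issue in enumerate(recommended[:k], start=1):
        if issue in relevant:
            hits += 1
            precision_sum += hits / rank
    if not relevant or k == 0:
        return 0.0
    return precision_sum / min(len(relevant), k)


def mean_average_precision(per_developer, k):
    """per_developer: list of (recommended, relevant) pairs, one per developer."""
    if not per_developer:
        return 0.0
    values = [average_precision_at_k(rec, rel, k) for rec, rel in per_developer]
    return sum(values) / len(values)


recommended = ["ISSUE-1", "ISSUE-2", "ISSUE-3", "ISSUE-4"]  # hypothetical ranking
relevant = {"ISSUE-1", "ISSUE-3", "ISSUE-9"}                # issues the developer actually took
print(precision_recall_f1_at_k(recommended, relevant, 4))
print(average_precision_at_k(recommended, relevant, 4))
```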

    A multi-label, dual-output deep neural network for automated bug triaging

    Bug tracking enables the monitoring and resolution of issues and bugs within organizations. Bug triaging, or assigning bugs to the owner(s) who will resolve them, is a critical component of this process because there are many incorrect assignments that waste developer time and reduce bug resolution throughput. In this work, we explore the use of a novel two-output deep neural network architecture (Dual DNN) for triaging a bug to both an individual team and developer, simultaneously. Dual DNN leverages this simultaneous prediction by exploiting its own guess of the team classes to aid in developer assignment. A multi-label classification approach is used for each of the two outputs to learn from all interim owners, not just the last one who closed the bug. We make use of a heuristic combination of the interim owners (owner-importance-weighted labeling) which is converted into a probability mass function (pmf). We employ a two-stage learning scheme, whereby the team portion of the model is trained first and then held static to train the team–developer and bug–developer relationships. The scheme employed to encode the team–developer relationships is based on an organizational chart (org chart), which renders the model robust to organizational changes as it can adapt to role changes within an organization. There is an observed average lift (with respect to both team and developer assignment) of 13 percentage points in 11-fold incremental-learning cross-validation (IL-CV) accuracy for Dual DNN utilizing owner-weighted labels compared with the traditional multi-class classification approach. Furthermore, Dual DNN with owner-weighted labels achieves average 11-fold IL-CV accuracies of 76% (team assignment) and 55% (developer assignment), outperforming reference models by 14 and 25 percentage points, respectively, on a proprietary dataset with 236,865 entries. Comment: 8 pages, 2 figures, 9 tables
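
The abstract does not spell out the owner-importance heuristic, so the following is only a sketch of the general idea of turning an ownership history into a pmf over classes. The weighting rule used here (later owners count more, weight equal to position in the history) is an assumption made for illustration, as are the function name `owner_weighted_pmf` and the sample names.

```python
def owner_weighted_pmf(owner_history, classes):
    """Convert a bug's ordered list of interim owners into a pmf over classes.

    ASSUMPTION: importance grows linearly with position in the history, so
    the final owner weighs the most. The paper's actual heuristic may differ.
    """
    weights = dict.fromkeys(classes, 0.0)
    for position, owner in enumerate(owner_history, start=1):
        if owner in weights:
            weights[owner] += position
    total = sum(weights.values())
    if total == 0:
        raise ValueError("no owner in the history belongs to the known classes")
    # Normalize so the soft label sums to 1 and can serve as a training target.
    return {owner: weight / total for owner, weight in weights.items()}


history = ["alice", "bob", "alice"]          # hypothetical interim owners, oldest first
developers = ["alice", "bob", "carol"]       # hypothetical output classes
print(owner_weighted_pmf(history, developers))
```

A soft target of this kind lets a classifier trained with a cross-entropy loss learn from every interim owner, rather than only from the developer who closed the bug.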

    From Bugs to Decision Support – Leveraging Historical Issue Reports in Software Evolution

    Software developers in large projects work in complex information landscapes and staying on top of all relevant software artifacts is an acknowledged challenge. As software systems often evolve over many years, a large number of issue reports is typically managed during the lifetime of a system, representing the units of work needed for its improvement, e.g., defects to fix, requested features, or missing documentation. Efficient management of incoming issue reports requires the successful navigation of the information landscape of a project. In this thesis, we address two tasks involved in issue management: Issue Assignment (IA) and Change Impact Analysis (CIA). IA is the early task of allocating an issue report to a development team, and CIA is the subsequent activity of identifying how source code changes affect the existing software artifacts. While IA is fundamental in all large software projects, CIA is particularly important to safety-critical development. Our solution approach, grounded on surveys of industry practice as well as scientific literature, is to support navigation by combining information retrieval and machine learning into Recommendation Systems for Software Engineering (RSSE). While the sheer number of incoming issue reports might challenge the overview of a human developer, our techniques instead benefit from the availability of ever-growing training data. We leverage the volume of issue reports to develop accurate decision support for software evolution. We evaluate our proposals both by deploying an RSSE in two development teams, and by simulation scenarios, i.e., we assess the correctness of the RSSEs' output when replaying the historical inflow of issue reports. In total, more than 60,000 historical issue reports are involved in our studies, originating from the evolution of five proprietary systems for two companies. 
Our results show that RSSEs for both IA and CIA can help developers navigate large software projects, in terms of locating development teams and software artifacts. Finally, we discuss how to support the transfer of our results to industry, focusing on addressing the context dependency of our tool support by systematically tuning parameters to a specific operational setting.

    Bug Triaging with High Confidence Predictions

    Correctly assigning bugs to the right developer or team, i.e., bug triaging, is a costly activity. A concerted effort has been made at Ericsson to adopt automated bug triaging to reduce development costs. We also perform a case study on Eclipse bug reports. In this work, we replicate the research approaches that have been widely used in the literature, including FixerCache. We apply them to over 10k bug reports for 9 large products at Ericsson and 2 large Eclipse products containing 21 components. We find that a logistic regression classifier including simple textual and categorical attributes of the bug reports has the highest accuracy: 79.00% and 46% on Ericsson and Eclipse bug reports, respectively. Ericsson’s bug reports often contain logs that have crash dumps and alarms. We add this information to the bug triage models. We find that this information does not improve the accuracy of bug triaging in Ericsson’s context. Eclipse bug reports contain stack traces, which we add to the bug triaging model. Stack traces are only present in 8% of bug reports and do not improve the triage accuracy. Although our models perform as well as the best ones reported in the literature, a criticism of bug triaging at Ericsson is that the accuracy is not sufficient for regular use. We develop a novel approach that only triages bugs when the model has high confidence in the triage prediction. We find that we improve the accuracy to 90% at Ericsson and 70% at Eclipse, but we can make predictions for only 62% and 25% of the total Ericsson and Eclipse bug reports, respectively.
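
The trade-off between accuracy and the share of bugs that get a prediction can be sketched as follows. This is a minimal illustration assuming the classifier exposes per-team probabilities; the function name `triage_with_threshold`, the threshold values, and the sample probabilities are hypothetical and do not reproduce the paper's models or data.

```python
def triage_with_threshold(predictions, threshold):
    """Triage only the bugs whose top predicted probability meets the threshold.

    predictions: list of (probabilities, true_team) pairs, where probabilities
    maps each team name to the model's predicted probability.
    Returns (accuracy on the triaged bugs, fraction of bugs triaged).
    """
    attempted = 0
    correct = 0
    for probabilities, true_team in predictions:
        team, confidence = max(probabilities.items(), key=lambda item: item[1])
        if confidence >= threshold:
            attempted += 1
            if team == true_team:
                correct += 1
    coverage = attempted / len(predictions) if predictions else 0.0
    accuracy = correct / attempted if attempted else 0.0
    return accuracy, coverage


sample = [
    ({"team_a": 0.90, "team_b": 0.10}, "team_a"),
    ({"team_a": 0.55, "team_b": 0.45}, "team_b"),
    ({"team_a": 0.20, "team_b": 0.80}, "team_b"),
    ({"team_a": 0.60, "team_b": 0.40}, "team_a"),
]
for cutoff in (0.0, 0.75):
    print(cutoff, triage_with_threshold(sample, cutoff))
```

Raising the threshold typically increases accuracy on the bugs that are triaged while leaving more bugs for manual assignment, which is the pattern the abstract reports.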

    Using Screenshot Attachments in Issue Reports for Triaging

    In previous work, we deployed IssueTAG, which uses the texts present in the one-line summary and the description fields of the issue reports to automatically assign them to the stakeholders who are responsible for resolving the reported issues. Since its deployment on January 12, 2018 at Softtech, i.e., the software subsidiary of the largest private bank in Turkey, IssueTAG has made a total of 301,752 assignments (as of November 2021). One observation we make is that a large fraction of the issue reports submitted to Softtech have screenshot attachments and, in the presence of such attachments, the reports often convey less information in their one-line summary and description fields, which tends to reduce the assignment accuracy. In this work, we use the screenshot attachments as an additional source of information to further improve the assignment accuracy, which (to the best of our knowledge) has not been studied before in this context. In particular, we develop a number of multi-source (using both the issue reports and the screenshot attachments) and single-source assignment models (using either the issue reports or the screenshot attachments) and empirically evaluate them on real issue reports. In the experiments, compared to the single-source model currently deployed in the field, the best multi-source model developed in this work significantly (both in the practical and statistical sense) improved the assignment accuracy for the issue reports with screenshot attachments from 0.843 to 0.858 at acceptable overhead costs, a result strongly supporting our basic hypothesis. Comment: Preprint for EMSE journal

    Aspect of Code Cloning Towards Software Bug and Imminent Maintenance: A Perspective on Open-source and Industrial Mobile Applications

    As a part of the digital era of microtechnology, mobile application (app) development is evolving with lightning speed to enrich our lives and bring new challenges and risks. In particular, software bugs and failures cost trillions of dollars every year and have caused fatalities, such as a software bug in a self-driving car that resulted in a pedestrian fatality in March 2018 and the recent Boeing-737 Max tragedies that resulted in hundreds of deaths. Software clones (duplicated fragments of code) are also found to be one of the crucial factors for having bugs or failures in software systems. There have been many significant studies on software clones and their relationships to software bugs for desktop-based applications. Unfortunately, while mobile apps have become an integral part of today’s era, there is a marked lack of such studies for mobile apps. In order to explore this important aspect, in this thesis, first, we studied the characteristics of software bugs in the context of mobile apps that might not be prevalent for desktop-based apps, such as energy-related (battery drain while using apps) and compatibility-related (different behaviors of the same app on different devices) bugs/issues. Using Support Vector Machine (SVM), we classified about 3K mobile app bug reports from different open-source development sites into four categories: crash, energy, functionality and security bugs. We then manually examined a subset of those bugs and found that over 50% of the bug-fixing code changes occurred in clone code. There have been a number of studies with desktop-based software systems that clearly show the harmful impacts of code clones and their relationships to software bugs.
Given that there is a marked lack of such studies for mobile apps, in our second study, we examined 11 open-source and industrial mobile apps written in two different languages (Java and Swift) and noticed that clone code is more bug-prone than non-clone code and that industrial mobile apps have a higher code clone ratio than open-source mobile apps. Furthermore, we correlated our study outcomes with those of existing desktop-based studies and surveyed 23 mobile app developers to validate our findings. Along with validating our findings from the survey, we noticed that around 95% of the developers usually copy/paste (code cloning) code fragments from the popular crowd-sourcing platform Stack Overflow (SO) into their projects and that over 75% of such developers experience bugs after such activities (the code cloning from SO). Existing studies with desktop-based systems also showed that while SO is one of the most popular online platforms for code reuse (and code cloning), SO code fragments are usually toxic from a software maintenance perspective. Thus, in the third study of this thesis, we studied the consequences of code cloning from SO in different open-source and industrial mobile apps. We observed that closed-source industrial apps reused even more SO code fragments than open-source mobile apps and that SO code fragments were more change-prone (for example, through bug fixes) than non-SO code fragments. We also observed that SO code fragments were related to more bugs in industrial projects than in open-source ones. Our studies show how we could efficiently and effectively manage clone-related software bugs for mobile apps by utilizing the positive sides of code cloning while overcoming (or at least minimizing) the negative consequences of clone fragments.

    Studying the lives of software bugs

    For as long as people have made software, they have made mistakes in that software. Software bugs are widespread, and the maintenance required to fix them has a major impact on the cost of software and how developers' time is spent. Reducing this maintenance time would lower the cost of software and allow for developers to spend more time on new features, improving the software for end-users. Bugs are hugely diverse and have a complex life cycle. This makes them difficult to study, and research is often carried out on synthetic bugs or toy programs. However, a better understanding of the bug life cycle would greatly aid in developing tools to reduce the time spent on maintenance. This thesis will study the life cycle of bugs, and develop such an understanding. Overall, this thesis examines over 3000 real bugs, from real projects, concentrating on three of the most important points in the life cycle: origin, reporting and fix. Firstly, two existing techniques are compared for discovering the origin of a bug. A number of improvements are evaluated, and the most effective approach is found to be combining the techniques. Furthermore, the behaviour of developers is found to have a major impact on the accuracy of the techniques. Secondly, a large number of bugs are analysed to determine what information is provided when users report bugs. For most bugs, much important information is missing, or inaccurate. Most importantly, there appears to be a considerable gap between what users provide and what developers actually want. Finally, an evaluation is carried out on a number of novel alterations to techniques used to determine the location of bug fixes. Compared to existing techniques, these alterations successfully increase the number of bugs which can be usefully localised, aiding developers in removing the bugs.

    Identification of Software Features in Issue Tracking System Data

    Knowledge of Software Features (SFs) is vital for software developers and requirements specialists during all software engineering phases: to understand and derive software requirements, to plan and prioritize implementation tasks, to update documentation, or to test whether the final product correctly implements the requested SF. In most software projects, SFs are managed in conjunction with other information such as bug reports, programming tasks, or refactoring tasks with the aid of Issue Tracking Systems (ITSs). Hence ITSs contain a variety of information that is only partly related to SFs. In practice, however, the usage of ITSs to store SFs comes with two major problems: (1) ITSs are neither designed nor used as documentation systems. Therefore, the data inside an ITS is often uncategorized and SF descriptions are concealed in rather lengthy text. (2) Although an SF is often requested in a single sentence, related information can be scattered among many issues. For example, implementation tasks related to an SF are often reported in additional issues. Hence, the detection of SFs in ITSs is complicated: a manual search for the SFs implies reading, understanding and exploiting the Natural Language (NL) in many issues in detail. This is cumbersome and labor intensive, especially if related information is spread over more than one issue. This thesis investigates whether SF detection can be supported automatically. First the problem is analyzed: (i) An empirical study shows that requests for important SFs reside in ITSs, making ITSs a good target for SF detection. (ii) A second study identifies characteristics of the information and related NL in issues. These characteristics represent opportunities as well as challenges for the automatic detection of SFs. Based on these problem studies, the Issue Tracking Software Feature Detection Method (ITSoFD) is proposed. The method has two main components and includes an approach to preprocess issues.
Each component addresses one of the problems associated with storing SFs in ITSs. ITSoFD is validated in three solution studies: (I) An empirical study researches how NL that describes SFs can be detected with techniques from Natural Language Processing (NLP) and Machine Learning. Issues are parsed and different characteristics of the issue and its NL are extracted. These characteristics are used to classify the issue’s content and identify SF description candidates, thereby approaching problem (1). (II) An empirical study researches how issues that carry information potentially related to an SF can be detected with techniques from NLP and Information Retrieval. Characteristics of the issue’s NL are utilized to create a traceability network of related issues, thereby approaching problem (2). (III) An empirical study researches how NL data in issues can be preprocessed using heuristics and hierarchical clustering. Code, stack traces, and other technical information are separated from NL. Heuristics are used to identify candidates for technical information, and clustering improves the heuristics’ results. The technique can be applied to support components I and II.
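
The heuristic part of the preprocessing step in study (III) can be sketched roughly as below. The regular expressions, the function name `split_issue_text`, and the sample issue text are all assumptions chosen for illustration; they are not the heuristics used in the thesis, and the hierarchical clustering refinement is not shown.

```python
import re

# Hypothetical line-level patterns that suggest technical content.
TECHNICAL_PATTERNS = [
    re.compile(r"^\s*at\s+[\w.$<>]+\(.*\)\s*$"),     # Java-style stack frame
    re.compile(r"^\s*[\w.]+(Exception|Error)\b"),     # exception header line
    re.compile(r"[;{}]\s*$"),                         # line that ends like source code
]


def split_issue_text(text):
    """Split an issue body into natural-language lines and technical lines."""
    natural, technical = [], []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        is_technical = any(pattern.search(line) for pattern in TECHNICAL_PATTERNS)
        (technical if is_technical else natural).append(line)
    return natural, technical


issue_body = (
    "The editor crashes when I open a file.\n"
    "java.lang.NullPointerException: null\n"
    "    at org.example.Editor.open(Editor.java:42)\n"
    "Please fix this soon."
)
natural_lines, technical_lines = split_issue_text(issue_body)
print(natural_lines)
print(technical_lines)
```

Line-level rules like these will misclassify some lines, which is presumably why the thesis combines heuristics with clustering rather than relying on patterns alone.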