10 research outputs found

    Data-Driven Application Maintenance: Views from the Trenches

    Full text link
    In this paper we present our experience during design, development, and pilot deployments of a data-driven machine learning based application maintenance solution. We implemented a proof of concept to address a spectrum of interrelated problems encountered in application maintenance projects including duplicate incident ticket identification, assignee recommendation, theme mining, and mapping of incidents to business processes. In the context of IT services, these problems are frequently encountered, yet there is a gap in bringing automation and optimization. Despite long-standing research around mining and analysis of software repositories, such research outputs are not adopted well in practice due to the constraints these solutions impose on the users. We discuss need for designing pragmatic solutions with low barriers to adoption and addressing right level of complexity of problems with respect to underlying business constraints and nature of data.Comment: Earlier version of paper appearing in proceedings of the 4th International Workshop on Software Engineering Research and Industrial Practice (SER&IP), IEEE Press, pp. 48-54, 201

    MultiDimEr: a multi-dimensional bug analyzEr

    Full text link
    Background: Bugs and bug management consumes a significant amount of time and effort from software development organizations. A reduction in bugs can significantly improve the capacity for new feature development. Aims: We categorize and visualize dimensions of bug reports to identify accruing technical debt. This evidence can serve practitioners and decision makers not only as an argumentative basis for steering improvement efforts, but also as a starting point for root cause analysis, reducing overall bug inflow. Method: We implemented a tool, MultiDimEr, that analyzes and visualizes bug reports. The tool was implemented and evaluated at Ericsson. Results: We present our preliminary findings using the MultiDimEr for bug analysis, where we successfully identified components generating most of the bugs and bug trends within certain components. Conclusions: By analyzing the dimensions provided by MultiDimEr, we show that classifying and visualizing bug reports in different dimensions can stimulate discussions around bug hot spots as well as validating the accuracy of manually entered bug report attributes used in technical debt measurements such as fault slip through.Comment: TechDebt@ICSE 2022: 66-7

    How Usability Defects Defer from Non-Usability Defects? : A Case Study on Open Source Projects

    Get PDF
    Usability is one of the software qualities attributes that is subjective and often considered as a less critical defect to be fixed. One of the reasons was due to the vague defect descriptions that could not convince developers about the validity of usability issues. Producing a comprehensive usability defect description can be a challenging task, especially in reporting relevant and important information. Prior research in improving defect report comprehension has often focused on defects in general or studied various aspects of software quality improvement such as triaging defect reports, metrics and predictions, automatic defect detection and fixing.  In this paper, we studied 2241 usability and non-usability defects from three open-source projects - Mozilla Thunderbird, Firefox for Android, and Eclipse Platform. We examined the presence of eight defect attributes - steps to reproduce, impact, software context, expected output, actual output, assume cause, solution proposal, and supplementary information, and used various statistical tests to answer the research questions. In general, we found that usability defects are resolved slower than non-usability defects, even for non-usability defect reports that have less information. In terms of defect report content, usability defects often contain output details and software context while non-usability defects are preferably explained using supplementary information, such as stack traces and error logs. Our research findings extend the body of knowledge of software defect reporting, especially in understanding the characteristics of usability defects. The promising results also may be valuable to improve software development practitioners' practice

    A Bug Bounty Perspective on the Disclosure of Web Vulnerabilities

    Get PDF
    Bug bounties have become increasingly popular in recent years. This paper discusses bug bounties by framing these theoretically against so-called platform economy. Empirically the interest is on the disclosure of web vulnerabilities through the Open Bug Bounty (OBB) platform between 2015 and late 2017. According to the empirical results based on a dataset covering nearly 160 thousand web vulnerabilities, (i) OBB has been successful as a community-based platform for the dissemination of web vulnerabilities. The platform has also attracted many productive hackers, (ii) but there exists a large productivity gap, which likely relates to (iii) a knowledge gap and the use of automated tools for web vulnerability discovery. While the platform (iv) has been exceptionally fast to evaluate new vulnerability submissions, (v) the patching times of the web vulnerabilities disseminated have been long. With these empirical results and the accompanying theoretical discussion, the paper contributes to the small but rapidly growing amount of research on bug bounties. In addition, the paper makes a practical contribution by discussing the business models behind bug bounties from the viewpoints of platforms, ecosystems, and vulnerability markets.Comment: 17th Annual Workshop on the Economics of Information Security, Innsbruck, https://weis2018.econinfosec.org

    Improved relative discriminative criterion using rare and informative terms and ringed seal search-support vector machine techniques for text classification

    Get PDF
    Classification has become an important task for automatically classifying the documents to their respective categories. For text classification, feature selection techniques are normally used to identify important features and to remove irrelevant, and noisy features for minimizing the dimensionality of feature space. These techniques are expected particularly to improve efficiency, accuracy, and comprehensibility of the classification models in text labeling problems. Most of the feature selection techniques utilize document and term frequencies to rank a term. Existing feature selection techniques (e.g. RDC, NRDC) consider frequently occurring terms and ignore rarely occurring terms count in a class. However, this study proposes the Improved Relative Discriminative Criterion (IRDC) technique which considers rarely occurring terms count. It is argued that rarely occurring terms count are also meaningful and important as frequently occurring terms in a class. The proposed IRDC is compared to the most recent feature selection techniques RDC and NRDC. The results reveal significant improvement by the proposed IRDC technique for feature selection in terms of precision 27%, recall 30%, macro-average 35% and micro- average 30%. Additionally, this study also proposes a hybrid algorithm named: Ringed Seal Search-Support Vector Machine (RSS-SVM) to improve the generalization and learning capability of the SVM. The proposed RSS-SVM optimizes kernel and penalty parameter with the help of RSS algorithm. The proposed RSS-SVM is compared to the most recent techniques GA-SVM and CS-SVM. The results show significant improvement by the proposed RSS-SVM for classification in terms of accuracy 18.8%, recall 15.68%, precision 15.62% and specificity 13.69%. In conclusion, the proposed IRDC has shown better performance as compare to existing techniques because its capability in considering rare and informative terms. Additionally, the proposed RSS- SVM has shown better performance as compare to existing techniques because it has capability to improve balance between exploration and exploitation

    Machine Learning for Software Engineering: A Tertiary Study

    Full text link
    Machine learning (ML) techniques increase the effectiveness of software engineering (SE) lifecycle activities. We systematically collected, quality-assessed, summarized, and categorized 83 reviews in ML for SE published between 2009-2022, covering 6,117 primary studies. The SE areas most tackled with ML are software quality and testing, while human-centered areas appear more challenging for ML. We propose a number of ML for SE research challenges and actions including: conducting further empirical validation and industrial studies on ML; reconsidering deficient SE methods; documenting and automating data collection and pipeline processes; reexamining how industrial practitioners distribute their proprietary data; and implementing incremental ML approaches.Comment: 37 pages, 6 figures, 7 tables, journal articl

    Intelligent Software Bugs Localization, Triage and Prioritization

    Full text link
    One of the time-consuming software maintenance tasks is the localization of software bugs especially in large systems. Developers have to follow a tedious process to reproduce the abnormal behavior then inspect a large number of files in order to resolve the bugs. Furthermore, software developers are usually overwhelmed with several reports of critical bugs to be addressed urgently and simultaneously. The management of these bugs is a complex problem due to the limited resources and the deadlines-pressure. Another critical task in this process is to assign appropriate priority to the bugs and eventually assign them to the right developers for resolution. Several studies have been proposed for bugs localization, the majority of them are recommending classes as outputs which may still require high inspection effort. In addition, there is a significant difference between the natural language used in bug reports and the programming language which limits the efficiency of existing approaches since most of them are mainly based on lexical similarity. Most of the existing studies treated bug reports in isolation when assigning them to developers. They also lack the understanding of dynamics of changing bug priorities. Thus, developers may spend considerable cognitive efforts moving between completely unrelated bug reports. To address these challenges, we proposed the following research contributions: 1. We proposed an automated approach to find and rank the potential classes and methods in order to localize software defects. Our approach finds a good balance between minimizing the number of recommended classes and maximizing the relevance of the proposed solution using a hybrid multi-objective optimization algorithm combining local and global search. Our hybrid multi-objective approach is able to successfully locate the true buggy methods within the top 10 recommendations for over 78% of the bug reports leading to a significant reduction of developers' effort comparing to class-level bug localization techniques. 2. We proposed an automated bugs triage approach based on the dependencies between several open bug reports. We defined the dependency between two bug reports as the number of common files to be inspected to localize the bugs. Then, we adopted multi-objective search to rank the bug reports for programmers. The results show a significant time reduction of over 30% in localizing the bugs simultaneously comparing to the traditional bugs prioritization technique based on only priorities. 3. We performed an empirical study to observe and understand the changes in bugs' priority in order to build a 3-W model on Why and When bug priorities change, and Who performs the change. We conducted interviews and a survey with practitioners as well as performed a quantitative analysis large database of bugs reports. As a result, we observed frequent changes in bug priorities and their impact on delaying critical bug fixes especially before shipping a new release.Ph.D.College of Engineering & Computer ScienceUniversity of Michigan-Dearbornhttp://deepblue.lib.umich.edu/bitstream/2027.42/170906/1/Rafi Almhana Final Dissertation.pdfDescription of Rafi Almhana Final Dissertation.pdf : Dissertatio

    Automated identification and qualitative characterization of safety concerns reported in UAV software platforms

    Get PDF
    Unmanned Aerial Vehicles (UAVs) are nowadays used in a variety of applications. Given the cyber-physical nature of UAVs, software defects in these systems can cause issues with safety-critical implications. An important aspect of the lifecycle of UAV software is to minimize the possibility of harming humans or damaging properties through a continuous process of hazard identification and safety risk management. Specifically, safety-related concerns typically emerge during the operation of UAV systems, reported by end-users and developers in the form of issue reports and pull requests. However, popular UAV systems daily receive tens or hundreds of reports of varying types and quality. To help developers timely identifying and triaging safety-critical UAV issues, we (i) experiment with automated approaches (previously used for issue classification) for detecting the safety-related matters appearing in the titles and descriptions of issues and pull requests reported in UAV platforms, and (ii) propose a categorization of the main hazards and accidents discussed in such issues. Our results (i) show that shallow machine learning-based approaches can identify safety-related sentences with precision, recall, and F-measure values of about 80\%; and (ii) provide a categorization and description of the relationships between safety issue hazards and accidents
    corecore