472 research outputs found

    Analysis and Detection of Information Types of Open Source Software Issue Discussions

    Full text link
    Most modern Issue Tracking Systems (ITSs) for open source software (OSS) projects allow users to add comments to issues. Over time, these comments accumulate into discussion threads embedded with rich information about the software project, which can potentially satisfy the diverse needs of OSS stakeholders. However, discovering and retrieving relevant information from the discussion threads is a challenging task, especially when the discussions are lengthy and the number of issues in ITSs are vast. In this paper, we address this challenge by identifying the information types presented in OSS issue discussions. Through qualitative content analysis of 15 complex issue threads across three projects hosted on GitHub, we uncovered 16 information types and created a labeled corpus containing 4656 sentences. Our investigation of supervised, automated classification techniques indicated that, when prior knowledge about the issue is available, Random Forest can effectively detect most sentence types using conversational features such as the sentence length and its position. When classifying sentences from new issues, Logistic Regression can yield satisfactory performance using textual features for certain information types, while falling short on others. Our work represents a nontrivial first step towards tools and techniques for identifying and obtaining the rich information recorded in the ITSs to support various software engineering activities and to satisfy the diverse needs of OSS stakeholders.Comment: 41st ACM/IEEE International Conference on Software Engineering (ICSE2019

    Intent classification for a management conversational assistant

    Get PDF
    Intent classification is an essential step in processing user input to a conversational assistant. This work investigates techniques of intent classification of chat messages used for communication among software development teams with the aim of building an intent classifier for a management conversational assistant integrated into modern communication platforms used by developers. Experiments conducted using rule-based and common ML techniques have shown that careful choice of classification features has a significant impact on performance, and the best performing model was able to obtain a classification accuracy of 72%. A set of techniques for extracting useful features for text classification in the software engineering domain was also implemented and tested

    Bots in software engineering: a systematic mapping study

    Get PDF
    Bots have emerged from research prototypes to deployable systems due to the recent developments in machine learning, natural language processing and understanding techniques. In software engineering, bots range from simple automated scripts to decision-making autonomous systems. The spectrum of applications of bots in software engineering is so wide and diverse, that a comprehensive overview and categorization of such bots is needed. Existing works considered selective bots to be analyzed and failed to provide the overall picture. Hence it is significant to categorize bots in software engineering through analyzing why, what and how the bots are applied in software engineering. We approach the problem with a systematic mapping study based on the research articles published in this topic. This study focuses on classification of bots used in software engineering, the various dimensions of the characteristics, the more frequently researched area, potential research spaces to be explored and the perception of bots in the developer community. This study aims to provide an introduction and a broad overview of bots used in software engineering. Discussions of the feedback and results from several studies provide interesting insights and prospective future directions

    Towards the Humanisation of Programming Tool Interactions

    Get PDF
    Program analysis tools, from simple static semantic analysis by a compiler, to complex dynamic analyses of data flow and security, have become commonplace in modern day programming. Many of the simpler analyses, such as the afore- mentioned compiler checking or linters designed to enforce code style, may even go unnoticed or unconsidered by most users, ubiquitous as they are. Despite this, and despite the obvious utility that such programming tools can provide, many warnings provided by them go unheeded by programmers most of the time.There are several reasons for this phenomenon: the propensity to produce false positives undermines confidence in the validity of warnings, the tools do not in- tegrate well into the normal workflow of the developer, sometimes the warning message is simply too esoteric for most users to understand, and so on. A com- mon theme can be drawn from these reasons for ignoring the often-times very useful information given by a programming tool: the tool itself is difficult to use.In this thesis, we consider ways in which we can bridge this gap between users and tools. To do this, we draw from observations about the way in which we interact with each other in the most basic human-to-human context. Applying these lessons to a human-tool interaction allow us to examine ways in which tools may be deficient, and investigate methods for making the interaction more natural and human-like.We explore this issue by framing the interaction as a "conversation" between a human and their development environment. We then present a new programming tool, Progger, built using design principles driven by the "conversational lens" which we use to look at these interactions. After this, we present a user study using a novel low-cost methodology, aimed at evaluating the efficacy of the Progger tool. From the results of this user study, we present a new, more streamlined version of Progger, and finally investigate the way in which it can be used to direct the users attention when conducting a code comprehension exercise

    Endless Data

    Get PDF
    Small and Medium Enterprises (SMEs), as well as micro teams, face an uphill task when delivering software to the Cloud. While rapid release methods such as Continuous Delivery can speed up the delivery cycle: software quality, application uptime and information management remain key concerns. This work looks at four aspects of software delivery: crowdsourced testing, Cloud outage modelling, collaborative chat discourse modelling, and collaborative chat discourse segmentation. For each aspect, we consider business related questions around how to improve software quality and gain more significant insights into collaborative data while respecting the rapid release paradigm
    • …
    corecore