726 research outputs found

    Retrieving curated Stack Overflow Posts of similar project tasks

    Get PDF
    Software development depends on diverse technologies and methods and as a result, software development teams often handle issues in which team members are not experts. In order to address this lack of expertise, developers typically search for information on web-based Q&A sites such as Stack Overflow, a well-known place to find solutions to specific technology-related problems. Access to these web-based Q&A locations is currently not integrated into the software development environment, and since the associations between software development projects and the supporting sources of known solutions, usually referred to as knowledge, is not explicitly recorded, software developers often need to search for solutions to similar recurring issues multiple times. This lack of integration hinders the reuse of the knowledge obtained, besides not avoiding efforts of search and selection, curation, of this knowledge over and over again. This research aims at proposing a study regarding explicitly associating project elements (such as project tasks) to Stack Overflow posts that have already been curated by developers, and presents a study about Stack Overflow posts suggestions to developers based on similarity of project tasks.O desenvolvimento de software depende de diversas tecnologias e métodos e, como resultado, as equipes de desenvolvimento de software geralmente lidam com problemas em que não são especialistas. Para lidar com a falta de conhecimento, desenvolvedores normalmente procuram informações em sites de perguntas e respostas, como o Stack Overflow, um site usado para encontrar soluções para problemas específicos relacionados à tecnologia. O acesso a esses sites não é integrado ao ambiente de desenvolvimento de software e porque as associações entre os projetos de desenvolvimento de software e as fontes de suporte de soluções conhecidas não são explicitamente registradas. Com isso, desenvolvedores de software podem investir um esforço em procurar soluções para problemas semelhantes várias vezes. Essa falta de integração dificulta o reuso do conhecimento obtido, além de não evitar esforços de busca e seleção, a curadoria, repetidas vezes. Esta pesquisa tem como objetivo realizar um estudo sobre a associação explicita entre elementos do projeto (como tarefas de projeto) a publicações do Stack Overflow que já sofreram curadoria por desenvolvedores, e apresenta um estudo sobre sugestões de publicações do Stack Overflow a desenvolvedores com base na similaridade de tarefas de projeto

    "Always Nice and Confident, Sometimes wrong": Developer's Experiences Engaging Generative AI Chatbots Versus Human-Powered Q&A Platforms

    Full text link
    Software engineers have historically relied on human-powered Q&A platforms, like Stack Overflow (SO), as coding aids. With the rise of generative AI, developers have adopted AI chatbots, such as ChatGPT, in their software development process. Recognizing the potential parallels between human-powered Q&A platforms and AI-powered question-based chatbots, we investigate and compare how developers integrate this assistance into their real-world coding experiences by conducting thematic analysis of Reddit posts. Through a comparative study of SO and ChatGPT, we identified each platform's strengths, use cases, and barriers. Our findings suggest that ChatGPT offers fast, clear, comprehensive responses and fosters a more respectful environment than SO. However, concerns about ChatGPT's reliability stem from its overly confident tone and the absence of validation mechanisms like SO's voting system. Based on these findings, we recommend leveraging each platform's unique features to improve developer experiences in the future

    Holistic recommender systems for software engineering

    Get PDF
    The knowledge possessed by developers is often not sufficient to overcome a programming problem. Short of talking to teammates, when available, developers often gather additional knowledge from development artifacts (e.g., project documentation), as well as online resources. The web has become an essential component in the modern developer’s daily life, providing a plethora of information from sources like forums, tutorials, Q&A websites, API documentation, and even video tutorials. Recommender Systems for Software Engineering (RSSE) provide developers with assistance to navigate the information space, automatically suggest useful items, and reduce the time required to locate the needed information. Current RSSEs consider development artifacts as containers of homogeneous information in form of pure text. However, text is a means to represent heterogeneous information provided by, for example, natural language, source code, interchange formats (e.g., XML, JSON), and stack traces. Interpreting the information from a pure textual point of view misses the intrinsic heterogeneity of the artifacts, thus leading to a reductionist approach. We propose the concept of Holistic Recommender Systems for Software Engineering (H-RSSE), i.e., RSSEs that go beyond the textual interpretation of the information contained in development artifacts. Our thesis is that modeling and aggregating information in a holistic fashion enables novel and advanced analyses of development artifacts. To validate our thesis we developed a framework to extract, model and analyze information contained in development artifacts in a reusable meta- information model. We show how RSSEs benefit from a meta-information model, since it enables customized and novel analyses built on top of our framework. The information can be thus reinterpreted from an holistic point of view, preserving its multi-dimensionality, and opening the path towards the concept of holistic recommender systems for software engineering

    We Are...Marshall, January 20, 2021

    Get PDF

    Using multiclass classification algorithms to improve text categorization tool:NLoN

    Get PDF
    Abstract. Natural language processing (NLP) and machine learning techniques have been widely utilized in the mining software repositories (MSR) field in recent years. Separating natural language from source code is a pre-processing step that is needed in both NLP and the MSR domain for better data quality. This paper presents the design and implementation of a multi-class classification approach that is based on the existing open-source R package Natural Language or Not (NLoN). This article also reviews the existing literature on MSR and NLP. The review classified the information sources and approaches of MSR in detail, and also focused on the text representation and classification tasks of NLP. In addition, the design and implementation methods of the original paper are briefly introduced. Regarding the research methodology, since the research goal is technology-oriented, i.e., to improve the design and implementation of existing technologies, this article adopts the design science research methodology and also describes how the methodology was adopted. This research implements an open-source Python library, namely NLoN-PY. This is an open-source library hosted on GitHub, and users can also directly use the tools published to the PyPI library. Since NLoN has achieved comparable performance on two-class classification tasks with the Lasso regression model, this study evaluated other multi-class classification algorithms, i.e., Naive Bayes, k-Nearest Neighbours, and Support Vector Machine. Using 10-fold cross-validation, the expanded classifier achieved AUC performance of 0.901 for the 5-class classification task and the AUC performance of 0.92 for the 2-class task. Although the design of this study did not show a significant performance improvement compared to the original design, the impact of unbalanced data distribution on performance was detected and the category of the classification problem was also refined in the process. These findings on the multi-class classification design can provide a research foundation or direction for future research

    Assessing Comment Quality in Object-Oriented Languages

    Get PDF
    Previous studies have shown that high-quality code comments support developers in software maintenance and program comprehension tasks. However, the semi-structured nature of comments, several conventions to write comments, and the lack of quality assessment tools for all aspects of comments make comment evaluation and maintenance a non-trivial problem. To understand the specification of high-quality comments to build effective assessment tools, our thesis emphasizes acquiring a multi-perspective view of the comments, which can be approached by analyzing (1) the academic support for comment quality assessment, (2) developer commenting practices across languages, and (3) developer concerns about comments. Our findings regarding the academic support for assessing comment quality showed that researchers primarily focus on Java in the last decade even though the trend of using polyglot environments in software projects is increasing. Similarly, the trend of analyzing specific types of code comments (method comments, or inline comments) is increasing, but the studies rarely analyze class comments. We found 21 quality attributes that researchers consider to assess comment quality, and manual assessment is still the most commonly used technique to assess various quality attributes. Our analysis of developer commenting practices showed that developers embed a mixed level of details in class comments, ranging from high-level class overviews to low-level implementation details across programming languages. They follow style guidelines regarding what information to write in class comments but violate the structure and syntax guidelines. They primarily face problems locating relevant guidelines to write consistent and informative comments, verifying the adherence of their comments to the guidelines, and evaluating the overall state of comment quality. To help researchers and developers in building comment quality assessment tools, we contribute: (i) a systematic literature review (SLR) of ten years (2010–2020) of research on assessing comment quality, (ii) a taxonomy of quality attributes used to assess comment quality, (iii) an empirically validated taxonomy of class comment information types from three programming languages, (iv) a multi-programming-language approach to automatically identify the comment information types, (v) an empirically validated taxonomy of comment convention-related questions and recommendation from various Q&A forums, and (vi) a tool to gather discussions from multiple developer sources, such as Stack Overflow, and mailing lists. Our contributions provide various kinds of empirical evidence of the developer’s interest in reducing efforts in the software documentation process, of the limited support developers get in automatically assessing comment quality, and of the challenges they face in writing high-quality comments. This work lays the foundation for future effective comment quality assessment tools and techniques

    Opinion Mining for Software Development: A Systematic Literature Review

    Get PDF
    Opinion mining, sometimes referred to as sentiment analysis, has gained increasing attention in software engineering (SE) studies. SE researchers have applied opinion mining techniques in various contexts, such as identifying developers’ emotions expressed in code comments and extracting users’ critics toward mobile apps. Given the large amount of relevant studies available, it can take considerable time for researchers and developers to figure out which approaches they can adopt in their own studies and what perils these approaches entail. We conducted a systematic literature review involving 185 papers. More specifically, we present 1) well-defined categories of opinion mining-related software development activities, 2) available opinion mining approaches, whether they are evaluated when adopted in other studies, and how their performance is compared, 3) available datasets for performance evaluation and tool customization, and 4) concerns or limitations SE researchers might need to take into account when applying/customizing these opinion mining techniques. The results of our study serve as references to choose suitable opinion mining tools for software development activities, and provide critical insights for the further development of opinion mining techniques in the SE domain
    • …
    corecore