How to Ask for Technical Help? Evidence-based Guidelines for Writing Questions on Stack Overflow
Context: The success of Stack Overflow and other community-based
question-and-answer (Q&A) sites depends mainly on the willingness of their
members to answer others' questions. In fact, when formulating requests on Q&A
sites, we are not simply seeking information; we are also asking for other
people's help and feedback. Understanding the dynamics of participation in Q&A
communities is essential to improving the value of crowdsourced knowledge.
Objective: In this paper, we investigate how information seekers can increase
the chance of eliciting a successful answer to their questions on Stack
Overflow by focusing on the following actionable factors: affect, presentation
quality, and time.
Method: We develop a conceptual framework of factors potentially influencing
the success of questions in Stack Overflow. We quantitatively analyze a set of
over 87K questions from the official Stack Overflow dump to assess the impact
of actionable factors on the success of technical requests. The information
seeker's reputation is included as a control factor. Furthermore, to understand
the role played by affective states in the success of questions, we
qualitatively analyze questions containing positive and negative emotions.
Finally, a survey is conducted to understand how Stack Overflow users perceive
the guideline suggestions for writing questions.
Results: We found that, regardless of user reputation, successful questions
are short, contain code snippets, and do not overuse uppercase characters.
As regards affect, successful questions adopt a neutral emotional style.
Conclusion: We provide evidence-based guidelines for writing effective
questions on Stack Overflow that software engineers can follow to increase the
chance of getting technical help. As for the role of affect, we empirically
confirmed community guidelines that suggest avoiding rudeness in question
writing.
Comment: Preprint, to appear in Information and Software Technology
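To make the actionable presentation factors above concrete (question length, code snippets, uppercase use), the following minimal Python sketch computes simple proxies for them from a question title and body; the function name, the HTML-tag check, and the word-based length measure are illustrative assumptions, not the authors' implementation.

import re

def question_features(title: str, body: str) -> dict:
    # Hypothetical proxies for the studied factors: length, presence of a
    # code snippet, and the share of uppercase characters in the body.
    letters = [c for c in body if c.isalpha()]
    uppercase_ratio = sum(c.isupper() for c in letters) / len(letters) if letters else 0.0
    return {
        "title_length_words": len(title.split()),
        "body_length_words": len(body.split()),
        "has_code_snippet": bool(re.search(r"<code>|<pre>", body)),  # question bodies are HTML
        "uppercase_ratio": round(uppercase_ratio, 3),
    }

print(question_features(
    "How do I parse JSON in Python?",
    "I tried <code>json.loads(data)</code> but get a ValueError. What am I missing?",
))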
An Exploratory Study of Documentation Strategies for Product Features in Popular GitHub Projects
[Background] In large open-source software projects, development knowledge is
often fragmented across multiple artefacts and contributors such that
individual stakeholders are generally unaware of the full breadth of the
product features. However, users want to know what the software is capable of,
while contributors need to know where to fix, update, and add features.
[Objective] This work aims at understanding how feature knowledge is documented
in GitHub projects and how it is linked (if at all) to the source code.
[Method] We conducted an in-depth qualitative exploratory content analysis of
25 popular GitHub repositories that provided the documentation artefacts
recommended by GitHub's Community Standards indicator. We first extracted
strategies used to document software features in textual artefacts and then
strategies used to link the feature documentation with source code. [Results]
We observed feature documentation in all studied projects in artefacts such as
READMEs, wikis, and website resource files. However, the features were often
described in an unstructured way. Additionally, tracing techniques to connect
feature documentation and source code were rarely used. [Conclusions] Our
results suggest that feature documentation is lacking (or low-prioritised) in
open-source projects, with little use of normalised structures and rare
explicit references to source code. As a result, product feature traceability
is likely to be very limited, and maintainability is likely to suffer over time.
Comment: Accepted for the New Ideas and Emerging Results (NIER) track of the
38th IEEE International Conference on Software Maintenance and Evolution
(ICSME)
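As a side note, the documentation artefacts surfaced by GitHub's Community Standards indicator can be checked mechanically in a cloned repository; the sketch below is a minimal illustration assuming commonly used file names, not the selection procedure used in the study.

from pathlib import Path

# Commonly recommended artefact locations (an assumption for illustration).
RECOMMENDED_ARTEFACTS = {
    "README": ["README.md", "README.rst", "README"],
    "CONTRIBUTING": ["CONTRIBUTING.md", ".github/CONTRIBUTING.md"],
    "CODE_OF_CONDUCT": ["CODE_OF_CONDUCT.md", ".github/CODE_OF_CONDUCT.md"],
    "LICENSE": ["LICENSE", "LICENSE.md", "LICENSE.txt"],
    "ISSUE_TEMPLATE": [".github/ISSUE_TEMPLATE.md", ".github/ISSUE_TEMPLATE"],
}

def audit_repo(repo_root: str) -> dict:
    # Map each artefact category to the first matching path, or None if absent.
    root = Path(repo_root)
    return {
        category: next((p for p in candidates if (root / p).exists()), None)
        for category, candidates in RECOMMENDED_ARTEFACTS.items()
    }

print(audit_repo("path/to/cloned/repo"))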
A Decade of Code Comment Quality Assessment: A Systematic Literature Review
Code comments are important artifacts in software systems and play a
paramount role in many software engineering (SE) tasks related to maintenance
and program comprehension. However, while it is widely accepted that high
quality matters in code comments just as it matters in source code, assessing
comment quality in practice is still an open problem. First and foremost, there
is no unique definition of quality when it comes to evaluating code comments.
The few existing studies on this topic rather focus on specific attributes of
quality that can be easily quantified and measured. Existing techniques and
corresponding tools may also focus on comments bound to a specific programming
language, and may only deal with comments with specific scopes and clear goals
(e.g., Javadoc comments at the method level, or in-body comments describing
TODOs to be addressed). In this paper, we present a Systematic Literature
Review (SLR) of the last decade of research in SE to answer the following
research questions: (i) What types of comments do researchers focus on when
assessing comment quality? (ii) What quality attributes (QAs) do they consider?
(iii) Which tools and techniques do they use to assess comment quality? and
(iv) How do they evaluate their studies on comment quality assessment in
general? Our evaluation, based on the analysis of 2353 papers and the actual
review of 47 relevant ones, shows that (i) most studies and techniques focus on
comments in Java code and thus may not be generalizable to other languages, and
(ii) the analyzed studies focus on four main QAs out of a total of 21 QAs
identified in the literature, with a clear predominance of checking consistency
between comments and the code. We observe that researchers rely on manual
assessment and specific heuristics rather than on the automated assessment of
comment quality attributes.
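To give one concrete example of the predominant quality attribute, comment/code consistency, the following minimal sketch scores the lexical overlap between a comment and the code it documents; the identifier splitting and the overlap measure are assumptions made for illustration, not a technique taken from the reviewed papers.

import re

def lexical_consistency(comment: str, code: str) -> float:
    # Share of comment words that also appear in code identifiers (0..1);
    # camelCase is split so "getUserName" contributes "get", "user", "name".
    def words(text: str) -> set:
        spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", text)
        return {w.lower() for w in re.findall(r"[A-Za-z]+", spaced)}
    comment_words, code_words = words(comment), words(code)
    if not comment_words:
        return 0.0
    return len(comment_words & code_words) / len(comment_words)

print(lexical_consistency("Returns the user name.",
                          "def get_user_name(self): return self._user_name"))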
A Benchmark Study on Sentiment Analysis for Software Engineering Research
A recent research trend has emerged to identify developers' emotions by
applying sentiment analysis to the content of communication traces left in
collaborative development environments. Trying to overcome the limitations
posed by using off-the-shelf sentiment analysis tools, researchers recently
started to develop their own tools for the software engineering domain. In this
paper, we report a benchmark study to assess the performance and reliability of
three sentiment analysis tools specifically customized for software
engineering. Furthermore, we offer a reflection on the open challenges, as they
emerge from a qualitative analysis of misclassified texts.
Comment: Proceedings of the 15th International Conference on Mining Software
Repositories (MSR 2018)
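Reliability in such a benchmark is typically reported as agreement between tools on the same texts; the sketch below computes Cohen's kappa for two hypothetical tools' outputs and is an illustrative example, not the study's evaluation code.

from collections import Counter

def cohen_kappa(labels_a, labels_b):
    # Chance-corrected agreement between two tools labeling the same texts.
    assert len(labels_a) == len(labels_b) and labels_a
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[l] * freq_b[l] for l in set(labels_a) | set(labels_b)) / (n * n)
    return (observed - expected) / (1 - expected) if expected < 1 else 1.0

# Made-up placeholder outputs for five developer comments.
tool_1 = ["positive", "neutral", "negative", "neutral", "negative"]
tool_2 = ["positive", "negative", "negative", "neutral", "neutral"]
print(f"kappa = {cohen_kappa(tool_1, tool_2):.2f}")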
Opinion Mining for Software Development: A Systematic Literature Review
Opinion mining, sometimes referred to as sentiment analysis, has gained increasing attention in software engineering (SE) studies.
SE researchers have applied opinion mining techniques in various contexts, such as identifying developers’ emotions expressed in
code comments and extracting users’ criticism of mobile apps. Given the large number of relevant studies available, it can take
considerable time for researchers and developers to figure out which approaches they can adopt in their own studies and what perils
these approaches entail.
We conducted a systematic literature review involving 185 papers. More specifically, we present 1) well-defined categories of opinion
mining-related software development activities, 2) available opinion mining approaches, whether they are evaluated when adopted in
other studies, and how their performance is compared, 3) available datasets for performance evaluation and tool customization, and 4)
concerns or limitations SE researchers might need to take into account when applying/customizing these opinion mining techniques.
The results of our study serve as references to choose suitable opinion mining tools for software development activities, and provide
critical insights for the further development of opinion mining techniques in the SE domain.
APICom: Automatic API Completion via Prompt Learning and Adversarial Training-based Data Augmentation
Based on developer needs and usage scenarios, API (Application Programming
Interface) recommendation is the process of assisting developers in finding the
required API among numerous candidate APIs. Previous studies mainly modeled API
recommendation as a recommendation task that returns multiple candidate APIs
for a given query, among which developers may still not find what they need.
Motivated by the neural machine translation research domain, we can model this
problem as a generation task, which aims to directly generate the required API
for the developer query. After a preliminary investigation, we found that the
performance of this intuitive approach is not promising; the reason is that
errors occur when generating the prefix of the API. However, developers often
know certain API prefix information during actual development. Therefore, we
model this problem as an automatic completion task and propose a novel
approach, APICom, based on prompt learning, which can generate the API related
to the query according to the prompts (i.e., API prefix information). Moreover,
the effectiveness of APICom depends heavily on the
quality of the training dataset. In this study, we further design a novel
gradient-based adversarial training method {\atpart} for data augmentation,
which can improve the normalized stability when generating adversarial
examples. To evaluate the effectiveness of APICom, we consider a corpus of 33k
developer queries and corresponding APIs. Compared with the state-of-the-art
baselines, our experimental results show that APICom outperforms all baselines
by at least 40.02%, 13.20%, and 16.31% in terms of the performance measures
EM@1, MRR, and MAP. Finally, our ablation studies confirm the effectiveness of
our component settings (such as our designed adversarial training method, the
pre-trained model we used, and prompt learning) in APICom.
Comment: Accepted in Internetware 202
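For readers unfamiliar with the reported measures, the following minimal sketch shows how EM@1, MRR, and MAP can be computed over ranked API candidate lists; the example queries and the handling of multiple relevant APIs per query are illustrative assumptions, not the paper's evaluation script.

def em_at_1(ranked_lists, relevant_sets):
    # Fraction of queries whose top-ranked prediction is a relevant API.
    return sum(r[0] in rel for r, rel in zip(ranked_lists, relevant_sets)) / len(ranked_lists)

def mrr(ranked_lists, relevant_sets):
    # Mean reciprocal rank of the first relevant API per query.
    total = 0.0
    for ranked, rel in zip(ranked_lists, relevant_sets):
        for i, api in enumerate(ranked, start=1):
            if api in rel:
                total += 1.0 / i
                break
    return total / len(ranked_lists)

def mean_average_precision(ranked_lists, relevant_sets):
    # Mean over queries of average precision across all relevant APIs.
    total = 0.0
    for ranked, rel in zip(ranked_lists, relevant_sets):
        hits, precisions = 0, []
        for i, api in enumerate(ranked, start=1):
            if api in rel:
                hits += 1
                precisions.append(hits / i)
        total += sum(precisions) / max(len(rel), 1)
    return total / len(ranked_lists)

# Hypothetical ranked outputs for two developer queries.
preds = [["json.loads", "json.dumps"], ["os.path.join", "os.path.exists"]]
gold = [{"json.loads"}, {"os.path.exists"}]
print(em_at_1(preds, gold), mrr(preds, gold), mean_average_precision(preds, gold))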