NLP2Code: Code Snippet Content Assist via Natural Language Tasks
Developers increasingly take to the Internet for code snippets to integrate
into their programs. To save developers the time required to switch from their
development environments to a web browser in the quest for a suitable code
snippet, we introduce NLP2Code, a content assist for code snippets. Unlike
related tools, NLP2Code integrates directly into the source code editor and
provides developers with a content assist feature to close the vocabulary gap
between developers' needs and code snippet metadata. Our preliminary
evaluation of NLP2Code shows that the majority of invocations lead to code
snippets rated as helpful by users and that the tool is able to support a wide
range of tasks.
Comment: tool demo video available at https://www.youtube.com/watch?v=h-gaVYtCznI; to appear as a tool demo paper at ICSME 2017 (https://icsme2017.github.io/).
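The abstract does not describe NLP2Code's matching algorithm here; as a rough, hypothetical sketch of the general idea behind a natural-language content assist (the task-to-snippet table and the overlap scoring below are illustrative only, not the tool's actual method), task phrases can be mapped to stored snippets and ranked by word overlap:

```python
# Hypothetical sketch of a natural-language-to-snippet content assist.
# The SNIPPETS table and the scoring are made up for illustration;
# they are not NLP2Code's actual data or algorithm.

SNIPPETS = {
    "read a text file line by line": "with open(path) as f:\n    for line in f: ...",
    "parse json from a string": "import json\ndata = json.loads(text)",
    "sort a list of dicts by key": "items.sort(key=lambda d: d['key'])",
}

def suggest(task, top_k=2):
    """Rank stored task phrases by word overlap with the typed task."""
    query = set(task.lower().split())
    scored = []
    for known_task, snippet in SNIPPETS.items():
        overlap = len(query & set(known_task.split()))
        if overlap:
            scored.append((overlap, known_task, snippet))
    scored.sort(reverse=True)
    return [(t, s) for _, t, s in scored[:top_k]]

print(suggest("parse a json string")[0][0])  # best-matching task phrase
```

A real content assist would use richer matching than bag-of-words overlap, but the lookup-and-rank loop is the core shape of the feature.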
Is Stack Overflow Overflowing With Questions and Tags
Programming question and answer (Q&A) websites, such as Quora, Stack
Overflow, and Yahoo! Answers, help us to understand programming concepts
easily and quickly, in a way that has been tested and applied by many
software developers. Stack Overflow is one of the most frequently used
programming Q&A websites, where the questions and answers posted are presently
analyzed manually, which requires a huge amount of time and resources. To save
this effort, we present a topic-modeling-based technique to analyze the words of
the original texts to discover the themes that run through them. We also
propose a method to automate the process of reviewing the quality of questions
in the Stack Overflow dataset, in order to avoid ballooning Stack Overflow with
insignificant questions. The proposed method also recommends appropriate
tags for a new post, which averts the creation of unnecessary tags on Stack
Overflow.
Comment: 11 pages, 7 figures, 3 tables. Presented at the Third International Symposium on Women in Computing and Informatics (WCI-2015).
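The abstract does not spell out the tag-recommendation step here. As a hedged sketch of the general idea (the tag vocabularies and scoring below are invented for illustration, not the paper's model), tags can be ranked by how strongly a new post's words overlap with words from posts previously given each tag:

```python
from collections import Counter

# Illustrative-only data: word counts drawn from posts previously
# labelled with each tag. A real system would learn these from the
# Stack Overflow dataset (e.g. via topic modeling).
TAG_WORDS = {
    "python": Counter(["list", "dict", "python", "loop", "import"]),
    "sql": Counter(["query", "join", "table", "select", "sql"]),
    "regex": Counter(["pattern", "match", "regex", "string", "group"]),
}

def recommend_tags(post, top_k=2):
    """Score each tag by how often the post's words appear in its vocabulary."""
    words = post.lower().split()
    scores = {tag: sum(vocab[w] for w in words) for tag, vocab in TAG_WORDS.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return [t for t in ranked if scores[t] > 0][:top_k]

print(recommend_tags("how to join two table rows in a select query"))
```

Recommending from an existing tag vocabulary, rather than letting askers free-type tags, is what prevents near-duplicate tags from accumulating.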
On Using Active Learning and Self-Training when Mining Performance Discussions on Stack Overflow
Abundant data is the key to successful machine learning. However, supervised
learning requires annotated data that are often hard to obtain. In a
classification task with limited resources, Active Learning (AL) promises to
guide annotators to examples that bring the most value for a classifier. AL can
be successfully combined with self-training, i.e., extending a training set
with the unlabelled examples for which a classifier is the most certain. We
report our experiences on using AL in a systematic manner to train an SVM
classifier for Stack Overflow posts discussing performance of software
components. We show that the training examples deemed as the most valuable to
the classifier are also the most difficult for humans to annotate. Despite
carefully evolved annotation criteria, we report low inter-rater agreement, but
we also propose mitigation strategies. Finally, based on one annotator's work,
we show that self-training can improve the classification accuracy. We conclude
the paper by discussing implications for future text miners aspiring to use AL and self-training.
Comment: Preprint of paper accepted for the Proc. of the 21st International Conference on Evaluation and Assessment in Software Engineering, 201
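The paper trains an SVM; as a generic, self-contained sketch of the self-training loop it describes (the nearest-centroid stand-in classifier, the toy data, and the margin-based certainty measure are all assumptions for illustration), the key step is repeatedly pseudo-labelling the unlabelled example the model is most certain about and adding it to the training set:

```python
import math

def centroid(points):
    """Mean of a list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def self_train(labelled, unlabelled, rounds=3):
    """Generic self-training loop with a nearest-centroid stand-in
    for the classifier (the paper itself uses an SVM)."""
    labelled = {c: list(pts) for c, pts in labelled.items()}
    pool = list(unlabelled)
    for _ in range(rounds):
        if not pool:
            break
        cents = {c: centroid(pts) for c, pts in labelled.items()}
        best = None
        for p in pool:
            d = sorted((dist(p, cent), c) for c, cent in cents.items())
            # Certainty proxy: gap between the two nearest centroids.
            margin = d[1][0] - d[0][0]
            if best is None or margin > best[0]:
                best = (margin, p, d[0][1])
        _, point, label = best
        labelled[label].append(point)  # extend the training set with the pseudo-label
        pool.remove(point)
    return labelled

# Toy example: two seed classes and three unlabelled points.
seed = {"perf": [(0.0, 0.0), (1.0, 0.0)], "other": [(9.0, 9.0), (10.0, 9.0)]}
grown = self_train(seed, [(0.5, 1.0), (9.5, 8.0), (5.0, 5.0)])
print({c: len(pts) for c, pts in grown.items()})
```

The loop absorbs the most confidently classified points first, which is exactly why, as the abstract notes, the examples AL deems most valuable (the uncertain ones near the boundary) are the ones self-training leaves for human annotators.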